For version 11.5 release of WASD VMS Web Services.
Published July 2020
Document generated using wasDOC version 2.0.0
This document provides detailed configuration instructions for the WASD Web Services package.
For installation and update details see WASD Web Services - Installation
For the more significant WASD features and facilities see WASD Web Services - Features
For information on CGI, CGIplus, ISAPI, OSU, etc., scripting, see WASD Web Services - Scripting
And for a description of WASD document, SSI and directory listing behaviours and options, WASD Web Services - Environment
WASD VMS Web Services – Copyright © 1996-2020 Mark G. Daniel
Licensed under the GNU Public License, Version 3;
This package is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
https://www.gnu.org/licenses/gpl.txt
You should have received a copy of the GNU General Public License along with this package; if not, write to the Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
[email protected]
A pox on the houses of all spammers. Make that two poxes.
All copyright and trademarks within this document belong to their rightful owners. See 13. Attribution and Acknowledgement.
This is a static (file), single document.
WASD is outlined in the Introduction and Package Overview sections of the WASD Features document.
Installation and update of the package is covered by WASD Installation.
This document provides detailed configuration instructions for the WASD Web Services package.
Following installation the package should require only minor further configuration for basic serving.
WASD configuration is performed using the contents of five files located using logical names
Name | Description |
---|---|
WASD_CONFIG_AUTH | request authorization control |
WASD_CONFIG_GLOBAL | global server configuration |
WASD_CONFIG_MAP | request processing control |
WASD_CONFIG_MSG | provides server messages |
WASD_CONFIG_SERVICE | specifies services (virtual servers) |
along with server CLI parameters commonly provided by startup DCL procedures.
Initially two files may require alteration.
More generally server runtime configuration involves the considerations discussed in 2.2 Site Organisation along with the following aspects:
When initially installing or configuring WASD, and sometimes later where something breaks spectacularly, it is most useful to be able to gain insight into what the server is up to.
The go-to tool is WATCH (yes, all capitals, and for no other reason than it makes it stand out).
WATCH is described in detail in WATCH Facility of the WASD Features and Facilities document.
For most circumstances WATCH can be made available for troubleshooting even if the configuration is significantly broken. This is done by using a skeleton-key to authorise special access into the server.
The skeleton-key is described in detail in Skeleton-Key Authentication of the WASD Features and Facilities document.
TL;DR
Enable the skeleton-key at the command-line. The username can be anything beginning with an underscore and at least 8 characters long; the password must also be at least 8 characters long.
Then using a browser access any available service, entering the above username (including underscore) and password when prompted.
The service administration facilities (of which WATCH is one) are also available and useful.
WASD has a global configuration, which applies characteristics to the entire running server, as well as per-service (virtual server) and conditional configuration, which applies characteristics or behaviours to specific requests. All configuration is provided via files located by logical names.
Name | Scope | Description |
---|---|---|
WASD_CONFIG_AUTH | loadable | request authorization control |
WASD_CONFIG_GLOBAL | global | global server configuration |
WASD_CONFIG_MAP | loadable | request processing control |
WASD_CONFIG_MSG | global | provides server messages |
WASD_CONFIG_SERVICE | global | specifies services (virtual servers) |
Simple editing of these files changes the configuration. Comment lines may be included by prefixing them with the hash ("#") character. Comment lines prefixed with an exclamation mark and then a hash ("!#") are displayed in Server Admin reports and are WATCHable during rule processing. Configuration file directives are not case-sensitive. Changes to a global configuration file can only be enabled by restarting the HTTPd process using the following command on the server system.
Changes to request mapping or authorization configuration files can also be dynamically reloaded into the running server using the administration command-line interface.
Changes to configuration files can be validated at the command-line before reload or restart. This detects and reports any syntactical and fatal configuration errors but of course cannot check the intent of the rules.
The config check sequentially processes each of the authorization, global, mapping, message and service configuration files.
If additional server startup qualifiers are required to enable specific configuration features then these must also be provided when checking. For example:
A server's currently loaded configuration can be interrogated from the Server Administration menu (see Server Administration of WASD Features and Facilities).
WASD uses multiple configuration files for a server and its site, each one providing for a different functional aspect … configuration, virtual services, path mapping, authorization, etc. Generally these configuration files are "flat", with all required directives included in a single file. This provides a simple and straight-forward approach suitable for most sites and allows for the provision of Server Administration page online configuration of several aspects.
It is also possible to build site configurations by including the contents of referenced files. This may provide a structure and flexibility not possible using the flat-file approach. All WASD configuration files allow the use of an [IncludeFile] directive. This takes a VMS file specification parameter. The file's contents are then loaded and processed as if part of the parent configuration file. These included files are allowed to be nested to a depth of two (i.e. the configuration file can include a file which may then include another file).
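The nesting limit described above can be illustrated with a minimal Python sketch. It is not WASD code: the `[IncludeFile] file-spec` line shape is assumed from the description, the file names are hypothetical, and the files are held in a dictionary rather than read from disk.

```python
import re

# Per the text: the configuration file can include a file which may then
# include another file, and no deeper.
MAX_INCLUDE_DEPTH = 2

# Assumed line shape for illustration: "[IncludeFile] <file-spec>".
# Directives are not case-sensitive, hence IGNORECASE.
INCLUDE_RE = re.compile(r"^\s*\[IncludeFile\]\s+(\S+)\s*$", re.IGNORECASE)


def expand_includes(name, files, depth=0):
    """Return the lines of `name` with included files spliced in place."""
    expanded = []
    for line in files[name]:
        match = INCLUDE_RE.match(line)
        if match is None:
            expanded.append(line)
            continue
        if depth >= MAX_INCLUDE_DEPTH:
            raise ValueError(f"include nested too deeply in {name}")
        expanded.extend(expand_includes(match.group(1), files, depth + 1))
    return expanded


# Hypothetical file names and contents (comment lines only).
files = {
    "MAP.CONF": [
        "# top-level rules",
        "[IncludeFile] SITE_A.CONF",
        "# more top-level rules",
    ],
    "SITE_A.CONF": ["# site A rules", "[includefile] SHARED.CONF"],
    "SHARED.CONF": ["# shared rules"],
}

for line in expand_includes("MAP.CONF", files):
    print(line)
```

In this sketch an `[IncludeFile]` line found inside `SHARED.CONF` would raise an error, because that file is already at the second level of nesting.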
The following is an example used to build up the mapping rules for four virtual services supported on the one server.
It is recommended that the server distribution tree and any document and other web-specific data areas be kept separate and distinct.
The former in WASD_ROOT:[000000], the latter perhaps in something like WEB:[000000]. This logical device could be provided with the following DCL introduced into the site or server startup procedures:
See 10.2 VMS File System Specifications for further information on the use of logical names in locating and defining the content and structure of a site.
Note that logical device names like this need not appear in the structure of the Web site. The root of the Web-accessible path can be concealed using a final mapping rule similar to the following
Mapping rules are the tools used to build a logical structure to a site from the physical area, perhaps multiple areas, used to house the associated files. The logical organisation of served data is largely hierarchical, organised under the Web-server path root, and is achieved via two mechanisms.
Physically distinct areas are used for good physical reasons (e.g. the area can best be hosted on a task-local disk), for historical reasons (e.g. the area existed before any Web environment existed) or for reasons of convenience (e.g. let's put this where access controls already allow the maintainers to manage it).
There are no good reasons for having site-specific documents integrated into the package directory structure!
All site-served files should be located in an autonomous, dedicated area or areas. The only reason to place script files into WASD_ROOT:[CGI-BIN] or WASD_ROOT:[architecture_BIN] is that the script is traditionally accessible via a /cgi-bin/ path or that the site is a small and/or low usage environment where this directory is conveniently available for the few extra scripts being made available.
For any significant site (size that as best suits your perception), or for when a specific software system or systems is being built or exists and it is being "Web-ified", design that software system as you would any other. That is, place the documentation in one directory area, executables and support procedures in their own, management files in another, data in yet another area, etc. Then make those portions that are required to be accessible via the Web interface accessible via the logical associations afforded through the use of the server's mapping rules (10. Request Processing Configuration). Of course existing areas that are to be now made available via the Web can be mapped in the same way. This includes the active components - executable scripts. There is no reason (apart from historical) why the /cgi-bin/ path should be used to activate scripts associated with a dedicated software system. Use a specific and unique path for scripts associated with each such system.
When making a directory structure available via the Web care must be taken that only the portions required to be accessed can be. Other areas should or must not be accessible. The server process can only access files that are world-accessible, that it is specifically granted access to via VMS protection mechanisms (e.g. ACLs), or that the individual SYSUAF-authorized accessor can access and which have specifically been made available via server authorization rules. Use the recommendations in 3.2 Recommended Package Security as guidelines when designing your own site's protections and permissions.
A particular area of the file system may be specified as the root of a particular (virtual) site's documents. This is done using the WASD_CONFIG_MAP SET map=root=<string> mapping rule. After this rule is applied all subsequent rules have the specified string prefixed to mapped strings before file-system resolution.
For example, the following WASD_CONFIG_MAP rule set
when applied to the following request URLs results in the described mappings being applied.
With the request for a directory icon using
And a request for a script using
Care must be taken in getting the sequence of mapping rules correct for access to non-site resources before actually setting the document root which then ties every other resource to that root.
A single WASD server process is capable of concurrently supporting the same host name on different port numbers and a number of different host names (DNS aliased or multi-homed) using the same port number. This capability is generally known as a virtual server. There is no design limitation on the number of these services that WASD will concurrently support. Virtual services offer versatile and powerful multi-site capabilities using the one system and server. Service determination is based on the contents of the request's "Host:" header field. If none is present it defaults to the base service for the interface's IP address and port.
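As a rough illustration of the service determination just described, the following Python sketch selects a service from the "Host:" field and falls back to a base service when the field is absent. The data structures, host names and addresses are hypothetical and are not how the server stores its services; IPv6 literals in the header are not handled.

```python
def select_service(services, base_services, host_header, interface_ip, port):
    """Return the service for a request, or None when nothing is configured.

    services      maps (host name, port) to a service description
    base_services maps (interface IP, port) to the (host name, port) key of
                  the base service for that interface
    """
    if host_header:
        # The field may carry an explicit port ("host:port"); only the
        # host name part is compared here.
        host = host_header.partition(":")[0].strip().lower()
        return services.get((host, port))
    # No "Host:" field: default to the base service for the interface.
    return services.get(base_services.get((interface_ip, port)))


# Hypothetical configuration: one host name on two ports, and two host
# names sharing port 80.
services = {
    ("www.example.com", 80): "site-a",
    ("www.example.com", 8080): "site-a-alt",
    ("docs.example.com", 80): "site-b",
}
base_services = {("192.0.2.10", 80): ("www.example.com", 80)}

print(select_service(services, base_services, "docs.example.com", "192.0.2.10", 80))
print(select_service(services, base_services, None, "192.0.2.10", 80))
```

The sketch simply returns `None` when the requested host and port match no service; what the server itself does in that case is a separate matter.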
If the logical name WASD_CONFIG_SERVICE is defined the deprecated WASD_CONFIG_GLOBAL [Service] directive is not used (see below).
See 7.7 Service Directives for further detail.
Using the [Service] WASD_CONFIG_GLOBAL configuration parameter or the /SERVICE qualifier the server creates an HTTP service for each one specified. If the host name is omitted it defaults to the local host name. If the port is omitted it defaults to 80. The first port specified in the service list becomes the "administration" port of the server, using the local host name, appearing in administration reports, menus, etc. This port is also that specified when sending control commands via the /DO= qualifier.
This rather contrived example shows a server configured to provide four services over two host names.
Note that both the WASD_CONFIG_SERVICE configuration file (see 7.7 Service Directives) and the /SERVICE= command-line qualifier override this directive.
The essential profile of a site is established by its mapped resources and any authorization controls, the WASD_CONFIG_MAP and WASD_CONFIG_AUTH configuration files respectively, and these two files support directives that allow configuration rules to be applied to all virtual services (i.e. a default), to a host name (all ports), or to a single specified service (host name and specific port).
To restrict rules to a specified server (virtual or real) add a line containing the server host name, and optionally a port number, between double-square brackets. All following rules will be applied only to that service. If a port number is not present it applies to all ports for that service name, otherwise only to the service using that port. To resume applying rules to all services use a single asterisk instead of a host name. In this way default (all service) and server-specific rules may be interleaved to build a composite environment, server-specific yet with defaults. Note that service-specific and service-common rules may be mixed in any order allowing common rules to be shared. This descriptive example shows a file with one rule per line.
Both the mapping and authorization modules report if rules are provided for services that are not configured for the particular server process (i.e. not in the server's [Service] or /SERVICE parameter list). This provides feedback to the site administrator about any configuration problems that exist, but may also appear if a set of rules is shared between multiple processes on a system or cluster where processes deliver differing services. In this latter case the reports can be considered informational, but should be checked initially and then occasionally for misconfiguration.
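The scoping of rules by double-square-bracket lines, and the reporting of scopes that match no configured service, can be sketched in Python as follows. This only illustrates the logic described above. The colon between host name and port is an assumption, the rule lines are placeholders rather than real rule syntax, and the host names are hypothetical.

```python
def scope_rules(lines):
    """Yield (host, port, rule) for each rule line.

    host None means the rule applies to all services; port None means it
    applies to all ports of that host name.
    """
    host, port = None, None
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("[[") and line.endswith("]]"):
            target = line[2:-2].strip().lower()
            if target == "*":
                host, port = None, None  # resume all-service rules
            else:
                name, sep, number = target.partition(":")
                host, port = name, (int(number) if sep else None)
            continue
        yield host, port, line


def unconfigured_scopes(scoped_rules, services):
    """Return the (host, port) scopes that match no configured service."""
    missing = set()
    for host, port, _rule in scoped_rules:
        if host is None:
            continue
        matched = any(
            h == host and (port is None or p == port) for h, p in services
        )
        if not matched:
            missing.add((host, port))
    return missing


# Placeholder rule text; only the bracketed scope lines matter here.
lines = [
    "rule-for-everyone-1",
    "[[alpha.example.com]]",
    "rule-for-alpha-any-port",
    "[[beta.example.com:8080]]",
    "rule-for-beta-8080",
    "[[*]]",
    "rule-for-everyone-2",
]
services = {("alpha.example.com", 80), ("alpha.example.com", 443)}

scoped = list(scope_rules(lines))
for entry in scoped:
    print(entry)
print("not configured:", unconfigured_scopes(scoped, services))
```

Here the beta scope is reported because no service with that host name and port is configured for this (hypothetical) server process.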
If a service is not configured for the particular host address and port of a request one of two actions will be taken.
This applies to dotted-decimal addresses as well as alpha-numeric. Therefore if there is a requirement to connect via a numeric IP address such a service must have been configured.
Note also that the converse is possible. That is, it's possible to configure a service that the server cannot ever possibly respond to because it does not have an interface using the IP address represented by the service host.
WASD can apply GZIP compression (gzip, deflate) to any suitable response body and can accept similarly compressed request bodies. It dynamically maps required functions from a ZLIB shareable image. Originally developed against the ZLIB v1.2.n port by Jean-François Piéronne, the VMS-PORTS (GNV) LIBZ package is also supported.
WASD dynamically maps the associated shareable image by successively accessing the (optionally defined) WASD_LIBZ_SHR32 logical name, then GNV$LIBZSHR32, then LIBZ_SHR32, before reporting GZIP unavailable.
The shareable image must be INSTALLed (without any particular privileges) before it can be activated by the privileged WASD HTTPd image (the WASD startup will automatically do this if necessary). The server process log and the Server Administration page, Statistics Report panel named Environment, contain the version activated or a VMS status message if an error was encountered.
The WASD_CONFIG_GLOBAL directive [GzipResponse] controls whether this feature is enabled for the gzip content-encoding of suitable response bodies. This directive requires at least one parameter, the compression level in the range 1..9. Smaller values provide faster but poorer compression while larger values provide better compression at the cost of more CPU cycles and latency. This corresponds to the GZIP utility's -1..-9 CLI switches. Two optional parameters allow ZLIB's 'memLevel' and 'windowBits' to be adjusted by ZLIB aficionados (level[,memory,window]). A small amount of experimentation by this author indicates minor changes in memory usage and compression ratio by fiddling with these.
Be aware that GZIP encoding is memory intensive. From 132kB to 265kB has been observed per compressing request (WATCH provides this in a summary line). These values apply across a wide range of transfer sizes (from kilobytes to tens of megabytes). It also is CPU intensive and adds response latency, though that might well be offset by significant reductions in transfer time on the Internet or other slower, non-intranet infrastructures. Text content compression has been observed from 30% to 10% of the original file size (even down to 1% in the case of the extremely redundant content of [EXAMPLE]64K.TXT). VMS executables (for want of another binary test case) compress to around 40%. In other words, GZIP encoding may not be suitable or efficient for every site or every request!
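The effect of the level, memLevel and windowBits parameters can be explored outside the server with Python's standard zlib module, which wraps the same ZLIB library. This is only an experiment harness, not WASD code; the sample body is artificial and highly redundant, so the ratios it prints say nothing about real site content.

```python
import zlib


def gzip_body(data, level, mem_level=zlib.DEF_MEM_LEVEL,
              window_bits=zlib.MAX_WBITS):
    """Compress data with gzip framing using the given ZLIB parameters."""
    # Adding 16 to the window size asks ZLIB for a gzip header and trailer.
    compressor = zlib.compressobj(level, zlib.DEFLATED,
                                  16 + window_bits, mem_level)
    return compressor.compress(data) + compressor.flush()


# Artificial, very repetitive sample body.
body = b"The quick brown fox jumps over the lazy dog.\n" * 2000

for level in (1, 6, 9):
    packed = gzip_body(body, level)
    print(f"level {level}: {len(packed)} bytes, "
          f"{100 * len(packed) / len(body):.1f}% of original")

# Smaller memLevel and windowBits reduce compressor memory use.
small = gzip_body(body, 6, mem_level=1, window_bits=9)
print(f"level 6, memLevel 1, windowBits 9: {len(small)} bytes")
```

Running it against representative files from a site gives a first feel for whether a higher level is worth the extra CPU time.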
Once enabled WASD will GZIP the responses for all suitable contents provided the client accepts the encoding and the response is not one of the following:
Additional control may be exercised with the following path SETings:
Using path settings GZIP compression may be disabled for specified file types (apart from those already suppressed as described above).
A script using the Script-Control: X-content-encoding-gzip=0 CGI response header can similarly suppress GZIP compression of its output if required. See "Scripting Overview" for further detail.
By default GZIP encoding flushes the internal buffer only when full. Most commonly this is not an issue because of high rates of output. However with slow output sources, such as from some classes of script, this can result in considerable latency before a client sees an initial response, and then between transmission of further output. By default output is initially flushed after 5 seconds and thereafter at a maximum interval of 15 seconds. The WASD_CONFIG_GLOBAL directive [GzipFlushSeconds] allows this period to be adjusted.
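Why slow output sources benefit from a periodic flush can be seen with Python's zlib module. The sketch below shows only the buffering behaviour of the compressor. It does not model the server's timers, and the sample output line is made up.

```python
import zlib

GZIP_FRAMING = 16 + zlib.MAX_WBITS  # gzip header/trailer, maximum window

compressor = zlib.compressobj(6, zlib.DEFLATED, GZIP_FRAMING)
decompressor = zlib.decompressobj(GZIP_FRAMING)

# A slow script produces one short line of output.
line = b"first line of slow script output\n"
pending = compressor.compress(line)

# Without a flush the client cannot yet recover any of the content:
# the data is still sitting in the compressor's internal buffer.
seen_before_flush = decompressor.decompress(pending)
print("before flush:", seen_before_flush)

# A sync flush pushes out everything buffered so far, at some cost in
# compression ratio, and the client can decode it immediately.
flushed = compressor.flush(zlib.Z_SYNC_FLUSH)
seen_after_flush = decompressor.decompress(flushed)
print("after flush: ", seen_after_flush)
```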
Decoding of GZIP content-encoded request bodies is enabled using the WASD_CONFIG_GLOBAL directive [GzipAccept]. Enabling this using a value 15 (or 1) results in the server advertising its acceptance of GZIPed requests using the "Accept-Encoding: gzip, deflate" response header. Requests containing bodies GZIP compressed will have these decoded as they are read from the client and before further processing, such as the upload of files into server accessible file-system space. This decoding is optional and not the default with DCL and DECnet script processing. That is, a request body will be passed to the script still encoded unless specific mapping directs otherwise. Decoding by the server into the original data prior to transferring to the script can be enabled for all or selected scripts using the following path settings:
Note that scripts need to be specially aware of both GZIP encoded bodies and those already decoded by the server. In the first case the stream must be read to the specified content-length and then decoded. In the second case, a content-length cannot be provided by the server (without unencoding the entire stream ahead of time it cannot predict the final size). Where the server is to decode the request body before transferring it to the script it changes the CGI variable CONTENT_LENGTH to a single question-mark ("?"). Scripts may use this to detect the server's intention and then must ignore any transfer-encoding and/or content-encoding header information and read the request body until end-of-file is received.
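A script's handling of the two cases might look like the following Python sketch. It assumes the request's content-encoding is visible to the script as a CGI variable named HTTP_CONTENT_ENCODING (an assumption made for illustration), handles gzip only, and uses in-memory streams in place of the real script input.

```python
import gzip
import io


def read_request_body(environ, stream):
    """Return the decoded request body for either case described above."""
    length = environ.get("CONTENT_LENGTH", "")
    if length == "?":
        # The server is decoding the body itself: ignore any encoding
        # headers and read until end-of-file.
        return stream.read()
    raw = stream.read(int(length or 0))
    # HTTP_CONTENT_ENCODING is an assumed variable name for this sketch.
    encoding = environ.get("HTTP_CONTENT_ENCODING", "").strip().lower()
    if encoding == "gzip":
        return gzip.decompress(raw)  # body arrived still encoded
    return raw


original = b"field=value&other=thing"

# Case 1: body passed through still GZIP encoded, with a real length.
encoded = gzip.compress(original)
env1 = {"CONTENT_LENGTH": str(len(encoded)), "HTTP_CONTENT_ENCODING": "gzip"}
print(read_request_body(env1, io.BytesIO(encoded)))

# Case 2: body already decoded by the server, length given as "?".
env2 = {"CONTENT_LENGTH": "?", "HTTP_CONTENT_ENCODING": "gzip"}
print(read_request_body(env2, io.BytesIO(original)))
```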
GZIP decoding (decompression) is understandably much less memory and CPU intensive. Experimentation indicates it does not contribute significantly to latency either.
Request "throttling" is a term adopted to describe controlling the number of requests that can be processing against any specified path at any one time. Requests in excess of this value are First-In-First-Out (FIFO) queued, up to an optional limit, waiting for a currently processing request to conclude allowing the next queued request to resume processing. This is primarily intended to limit concurrent resource-intensive script execution but could be applied to any resource path. Here's one dictionary description.
throttle
n 1: a valve that regulates the supply of fuel to the engine [syn: accelerator, throttle valve]
2: a pedal that controls the throttle valve; "he stepped on the gas" [syn: accelerator, accelerator pedal, gas pedal, gas, gun]
v 1: place limits on; "restrict the use of this parking lot" [syn: restrict, restrain, trammel, limit, bound, confine]
2: squeeze the throat of; "he tried to strangle his opponent" [syn: strangle, strangulate]
3: reduce the air supply; of carburetors [syn: choke]
This is applied to a path (or paths) using the WASD_CONFIG_MAP mapping SET THROTTLE= rule (10.5.5 SET Rule). The general format is
One way to read a throttle rule is "begin to throttle (queue) requests from the n1 value up to the n2 value, after which the queue is FIFOed up to the n3 value when it resumes queuing-only, up until the busy n4 value".
Each integer represents the number of concurrent requests against the throttle rule path. Parameters not required may be specified as zero or omitted in a comma-separated list. The schema of the rule requires that each successive parameter be larger than that preceding it. This basic consistency check is performed when the rule is loaded.
For any rule the possible maximum number of requests that can be processed at any one time may be simply calculated through the addition of the n1 value to the difference of the n3 and n2 values (i.e. max = n1 + (n3 - n2)). The maximum concurrently queued as the difference of the n4 and the maximum concurrently processed.
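The consistency check and the arithmetic above can be written out as a small Python sketch. The function names are made up for illustration. Treating an omitted n3 (with n2 present) or an omitted n4 as "limited only by server capacity" is this sketch's reading of the description, represented here by `None`.

```python
def check_throttle(n1, n2=0, n3=0, n4=0):
    """Raise ValueError unless each non-zero value exceeds the one before."""
    previous = 0
    for value in (n1, n2, n3, n4):
        if value == 0:
            continue  # parameter not required: zero or omitted
        if value <= previous:
            raise ValueError(f"{value} must be larger than {previous}")
        previous = value


def throttle_capacity(n1, n2=0, n3=0, n4=0):
    """Return (max concurrently processed, max concurrently queued)."""
    check_throttle(n1, n2, n3, n4)
    if n2 and n3:
        processing = n1 + (n3 - n2)  # max = n1 + (n3 - n2)
    elif n2:
        processing = None  # FIFO release never stops: server capacity
    else:
        processing = n1
    if n4 and processing is not None:
        queued = n4 - processing
    else:
        queued = None  # no busy limit: queue to server capacity
    return processing, queued


print(throttle_capacity(10, 20, 30, 40))  # (20, 20)
print(throttle_capacity(10, 0, 0, 30))    # (10, 20)
print(throttle_capacity(10))              # (10, None)
```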
A comprehensive throttle statistics report is available from the Server Administration facility.
If the concurrent processing value (n1) has a second, slash-delimited integer, this serves to limit the number of authenticated user-associated requests that can be concurrently processing.
When a request is available for processing the associated remote user name is checked for activity against the queue. The u1 (or per-user throttle value) is a limit on that user name's concurrent processing. If it would exceed the specified value the request is queued until the number of requests processing drops below the u1 value. All other values in the throttle rule are applied as for non-per-user throttling.
If an unauthenticated request is matched against the throttle rule (i.e. there is no authorization rule matching the request path) the client has a 500 (server error) response returned. Obviously per-user throttling must have a remote user name to throttle against and this is a configuration issue.
1) Requests up to 10 are concurrently processed. When 10 is reached further requests are queued to server capacity.
2) Concurrent requests to 10 are processed immediately. From 11 to 20 requests are queued. After 20 all requests are queued but also result in a request FIFOing off the queue to be processed (queue length is static, number being processed increases to server capacity).
3) Concurrent requests up to 15 are immediately processed. Requests 16 through to 30 are queued, while 31 to 40 requests result in the new requests being queued and waiting requests being FIFOed into processing. Concurrent requests from 41 onwards are again queued, in this scenario to server capacity.
4) Concurrent requests up to 10 are immediately processed. Requests 11 through to 20 will be queued. Concurrent requests from 21 to 30 are queued too, but at the same time waiting requests are FIFOed from the queue (resulting in 10 (n1) + 10 (n3-n2) = 20 being processed). From 31 onwards requests are just queued. Up to 40 concurrent requests may be against the path before all new requests are immediately returned with a 503 "busy" status. With this scenario no more than 20 can be concurrently processed with 20 concurrently queued.
5) Concurrent requests up to 10 are processed. When 10 is reached requests are queued up to request 30. When request 31 arrives it is immediately given a 503 "busy" status.
6) This is basically the same as scenario 4) but with a resume-on-timeout of two minutes. If there are currently 15 (or 22 or 28) requests (n1 exceeded, n3 still within limit) the queued requests will begin processing on timeout. Should there be 32 processing (n3 has reached limit) the request will continue to sit in the queue. The timeout would not be reset.
7) This is basically the same as scenario 3) but with a busy-on-timeout of three minutes. When the timeout expires the request is immediately dequeued with a 503 "busy" status.
8) Concurrent requests up to 10 are processed. The requests must be of authenticated users. Each authenticated user is allowed to execute at most one concurrent request against this path. When 10 is reached, or if fewer than 10 users are currently executing requests, then further requests are queued to server capacity.
9) This is basically the same as scenario 8) but with a busy-on-timeout of three minutes. When the timeout expires any requests still queued against the user name are immediately dequeued with a 503 "busy" status.
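The behaviour described in the non-timeout, non-per-user scenarios above can be summarised in a short Python sketch that classifies an arriving request by the number of concurrent requests against the path (counting the new one). It is a simplified model inferred from those descriptions, not the server's algorithm, and the function and label names are invented for illustration.

```python
def classify(concurrent, n1, n2=0, n3=0, n4=0):
    """Classify the request that brings the path to `concurrent` requests.

    Returns one of:
      "process"     processed immediately
      "queue"       queued only
      "queue+fifo"  queued, and a waiting request is released to processing
      "busy"        rejected with a 503 "busy" status
    Zero means the parameter is not used.
    """
    if n4 and concurrent > n4:
        return "busy"
    if concurrent <= n1:
        return "process"
    if n2 and concurrent > n2 and (not n3 or concurrent <= n3):
        return "queue+fifo"
    return "queue"


# The thresholds 10, 20, 30 and 40 implied by the four-value scenario.
for count in (10, 11, 20, 21, 30, 31, 40, 41):
    print(count, classify(count, 10, 20, 30, 40))
```

With only n1 and n4 given (10 and 30), the same function queues requests 11 to 30 and reports request 31 as busy, matching the corresponding description above.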
Throttling is applied using mapping rules. The set of these rules may be changed within an executing server using map reload functionality. This means the number of, and/or contents of, throttle rules may change during server execution. The throttle functionality needs to be independent of the mapping functionality (requests are processed independently of mapping rules once the rules have been applied). After a mapping reload the contents of the throttle data structures may be at variance with the constraints currently executing requests began processing under.
This should have little deleterious effect. The worst case is mis-applied constraints on the execution limits of changed request paths, and slightly confusing data in the Throttle Report. This quickly passes as requests being processed under the previous throttle constraints conclude and an entirely new collection of requests created using the constraints of the currently loaded rules are processed.
The "client_connect_gt:" mapping conditional (5. Conditional Configuration) attempts to allow some measurement of the number of requests a particular client currently has being processed. Using this decision criterion appropriate request mapping for controlling the additional requests can be undertaken. It is not intended to provide fine-grained control over activities, rather just to prevent a single client using an unreasonable proportion of the resources.
For example, if the number of requests from one particular client looks like it has got out of control (at the client end) then it becomes possible to queue (throttle) or reject further requests. In WASD_CONFIG_MAP
While not completely foolproof it does offer some measure of control over gross client concurrency abuse or error.
HTTP uses an implementation of the MIME (Multi-purpose Internet Mail Extensions) specification for identifying the type of data returned in a response. A MIME content-type consists of a plain text string describing the data as a type and slash-separated subtype, as illustrated in the following examples:
In common with most HTTP servers WASD uses a file's suffix (extension, type, e.g. .HTML, .TXT, .GIF) to identify the data type within the file. The [AddType] directive is used during configuration to bind a file type to a MIME content-type. To make the server recognise and return specific content-types these directives map file types to content-types.
With the VMS file system there is no effective file characteristic or algorithm for identifying a file's content without an exhaustive examination of the data contained there-in … a very expensive process (and probably still inconclusive in many cases), hence the reliance on the file type.
Mappings using [AddType] look like these.
To allow the server to share content-type definitions with other MIME-aware applications, and for WASD scripts to be able to perform their own mapping on a shared understanding of MIME content it is possible to move the file suffix to content-type mapping from a collection of [AddType]s in WASD_CONFIG_GLOBAL to an external file. This file is usually named MIME.TYPES and is specified in WASD_CONFIG_GLOBAL using the [AddMimeTypesFile] directive.
Mappings using MIME.TYPES look like these.
A leading content-type is mapped to single or multiple file suffixes. A general MIME.TYPES file commonly has content-types listed with no corresponding file suffix. These are ignored by WASD. Where a file suffix is repeated during configuration the latter version completely supersedes the former (with the Server Administration page showing an italicised and struck-through content-type to help identify duplicates).
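The handling just described can be sketched in Python for the common MIME.TYPES layout of a content-type followed by whitespace-separated suffixes. This is an illustration only: all comment lines are skipped, and the sample entries (including the deliberately repeated suffix) are made up.

```python
def parse_mime_types(lines):
    """Build a suffix -> content-type mapping from MIME.TYPES style lines."""
    mapping = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        content_type, *suffixes = line.split()
        # An entry with no suffixes contributes nothing (it is ignored).
        for suffix in suffixes:
            # A repeated suffix replaces the earlier entry completely.
            mapping[suffix.lower()] = content_type
    return mapping


sample = [
    "# sample entries for illustration",
    "text/html html htm",
    "application/pdf",
    "text/plain txt text",
    "image/gif gif",
    "text/x-example txt",
]

for suffix, content_type in sorted(parse_mime_types(sample).items()):
    print(f".{suffix} -> {content_type}")
```

In the sample the `application/pdf` line has no suffix and so maps nothing, and the second `txt` entry replaces the first.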
To allow the configuration information used by the server to generate directory listings with additional detail, WASD-specific extensions to the standard MIME.TYPES format are provided. These are "hidden" in comment structures so as not to interfere with non-WASD application use. All begin with a hash then an exclamation character ("#!") then another reserved character indicating the purpose of the extension. Existing comments are unaffected provided the second character is anything but an exclamation mark!
These directives are placed following the MIME-type entry they apply to. An example of the contents of a MIME.TYPES file with various WASD extensions.
Other reserved characters have been specified for development purposes but are not (perhaps currently) employed by the HTTP server.
If a file type is not recognised (i.e. no [AddType] or [AddMimeTypesFile] mapping corresponding to the file type) then by default WASD identifies its data as application/octet-stream (i.e. essentially binary data). Most browsers respond to this content-type with a download dialog, allowing the data to be saved as a file. Most commonly these unknown types manifest themselves when authors use "interesting" file names to indicate their purpose. Here are some examples the author has encountered:
If the site administrator would prefer another default content-type, perhaps "text/plain" so that any unidentified files default to plain text, then this may be configured by specifying that content-type as the description of the catch-all file type entry. Examples (use one of):
When accessing files it is possible to explicitly specify the identifying content-type to be returned to the browser in the HTTP response header. Of course this does not change the actual content of the file, just the header content-type! This is primarily provided to allow access to plain-text documents that have obscure, non-"standard" or non-configured file extensions.
It could also be used for other purposes, "forcing" the browser to accept a particular file as a particular content-type. This can be useful if the extension is not configured (as mentioned above) or in the case where the file contains data of a known content-type but with an extension conflicting with an already configured extension specifying data of a different content-type.
Enter the file path into the browser's URL specification field ("Location:", "Address:"). Then, for plain-text, append the following query string:
For another content-type substitute it appropriately. For example, to retrieve a text file in binary (why I can't imagine :-) use
This is an example:
It is possible to "force" the content-type for all files in a particular directory. Enter the path to the directory and then add
(or whatever type is desired). Links to files in the listing will contain the appropriate "?httpd=content&type=..." appended as a query string.
This is an example:
Language-specific variants of a document may be configured to be served automatically and transparently. This is organized as a basic file and name with each language-specific variant indicated by an additional "tag", one of the ISO language abbreviations used by the "Accept-Language:" request header field, e.g. en for English, fr for French, de for German, ru for Russian, etc.
Two variants of the basic file specification are possible; file name (the default) and file type. Hence if the basic file name is EXAMPLE.HTML then specifically German, English, French and Russian language versions in the directory would be either
A path must be explicitly SET using the accept=lang mapping rule as containing language variants. As searching for variants is a relatively expensive operation the rule(s) applying this functionality should be carefully crafted. The accept=lang rule accepts an optional default language representing the contents of the basic, untagged files. This provides an opportunity to more efficiently handle requests with a language first preference matching that of the default. In this case no variant search is undertaken, the basic file is simply served. The following example sets a path to contain files with a default language of French and possibly containing other language variants.
In this case the behaviour would be as follows. With the default language set to "fr" a request's "Accept-Language:" field is initially processed to check if the first preference is for "fr". If it is then there is no need for further accept language processing and the basic file is returned as the response. If not then the directory is searched for other files matching the EXAMPLE_*.HTML specification. All files matching this wildcard have the "*" portion (e.g. "EN", "FR", "DE", "RU") added to a list of variants. When the search is complete this list is compared to the request's "Accept-Language:" list. The first one to be matched has the contents of the corresponding file returned. If none are matched the default version would be returned.
This example of the behaviour is based on the contents of the directory described above. A request that specifies
will have EXAMPLE.HTML returned (without having searched for any other variants). For a request specifying
then the EXAMPLE_RU.HTML file is returned, and if no "Accept-Language:" is supplied with the request EXAMPLE.HTML would be returned. One or other file is always returned, with the default, non-language file always the fallback source of data. If it does not exist and no other language variant is selected the request returns a 404 file-not-found error.
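The selection behaviour described above can be sketched in Python as follows. This is an illustrative model only, not the server's implementation. It assumes the default file-name form of variant (EXAMPLE_xx.HTML), treats the order of the "Accept-Language:" list as the preference order (quality values are ignored), and compares language tags exactly, so region subtags such as "en-AU" are not handled. The function names are hypothetical.

```python
from typing import Iterable, List, Optional


def accepted_languages(header: Optional[str]) -> List[str]:
    """Split an Accept-Language value into lower-case tags, in listed order."""
    langs = []
    for item in (header or "").split(","):
        tag = item.split(";")[0].strip().lower()  # drop any ";q=..." part
        if tag:
            langs.append(tag)
    return langs


def select_variant(base: str, directory: Iterable[str], default_lang: str,
                   accept_language: Optional[str]) -> Optional[str]:
    """Return the file name to serve, or None (a 404) if nothing suits."""
    files = {name.upper() for name in directory}
    base = base.upper()
    fallback = base if base in files else None
    wanted = accepted_languages(accept_language)

    # First preference matches the default language: no variant search.
    if wanted and wanted[0] == default_lang.lower():
        return fallback

    # Otherwise collect the "*" portion of every NAME_*.TYPE file.
    name, _, ftype = base.rpartition(".")
    prefix, suffix = name + "_", "." + ftype
    variants = {
        f[len(prefix):-len(suffix)].lower(): f
        for f in files
        if f.startswith(prefix) and f.endswith(suffix)
        and len(f) > len(prefix) + len(suffix)
    }

    # The first accepted language with a matching variant wins.
    for lang in wanted:
        if lang in variants:
            return variants[lang]
    return fallback


if __name__ == "__main__":
    listing = ["EXAMPLE.HTML", "EXAMPLE_DE.HTML",
               "EXAMPLE_EN.HTML", "EXAMPLE_RU.HTML"]
    print(select_variant("EXAMPLE.HTML", listing, "fr", "ru, en"))
```

Note how the default-language check short-circuits before any directory search, which is the efficiency gain the optional default language is intended to provide.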
When using the accept=lang=(variant=type) form of the rule (i.e. the variant is placed on the file type rather than the default file name) each possible file extension must also have its content-type made known to the server. Using the example above the variants would need to be configured in a similar way to the following.
Normally only files with a content-type of "text/.." are subject to variant searching. If the rule path includes a file type then those files matching the rule are also variant-searched. In this way images, audio files, etc., may also have language-specific versions supplied transparently. The following illustrates this usage
The default character set sent in the response header for text documents (plain and HTML) is set using the [CharsetDefault] directive and/or the SET charset mapping rule. English language sites should specify ISO-8859-1, other Latin alphabet sites, ISO-8859-2, 3, etc. Cyrillic sites might wish to specify ISO-8859-5 or KOI8-R, and so on.
Document and CGI script output may be dynamically converted from one character set to another using the standard VMS NCS conversion library. The [CharsetConvert] directive provides the server with character set aliases (those that are for all requirements the same) and which NCS conversion function may be used to convert one character set into another.
When this directive is configured the server compares each text response's character set (if any) to each of the directive's document charset strings. If one matches, the server then compares each of the accepted charsets (if multiple) to the request "Accept-Charset:" list of accepted character sets.
At least one doc-charset and one accept-charset must be present. If only these two are present (i.e. no NCS-conversion-function) it indicates that the two character sets are aliases (i.e. the same set of characters, different name) and no conversion is necessary.
If an NCS-conversion-function is supplied it indicates that the document doc-charset can be converted to the request "Accept-Charset:" preference of the accept-charset using the NCS conversion function name specified.
A factor parameter can be appended to the conversion function. Some conversion functions require more than one output byte to represent one input byte for some characters. The 'factor' is an integer between 1 and 4 indicating how much more buffer space may be required for the converted string. It works by allocating that many times more output buffer space than is occupied by the input buffer. If not specified it defaults to 1, or an output buffer the same size as the input buffer.
Multiple comma-separated accept-charsets may be included as the second component for either of the above behaviours, with each being matched individually. Wildcard * (asterisk) and % (percentage) may be used in the doc-charset and accept-charset strings.
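The matching logic described in the preceding paragraphs can be modelled roughly as below. This Python sketch is illustrative only: the rule tuples stand in for [CharsetConvert] entries without reproducing the directive's syntax, the conversion function name is a hypothetical placeholder rather than a real NCS function, and the wildcard semantics assume the usual VMS convention ("*" for any run of characters, "%" for exactly one).

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class CharsetRule(NamedTuple):
    doc_charset: str                     # may contain * and % wildcards
    accept_charsets: List[str]           # one or more, matched individually
    ncs_function: Optional[str] = None   # None means the sets are aliases
    factor: int = 1                      # output buffer multiplier, 1 to 4


def wildcard_match(pattern: str, value: str) -> bool:
    """Case-insensitive match; * = any run, % = one character (assumed)."""
    regex = "".join(
        ".*" if c == "*" else "." if c == "%" else re.escape(c)
        for c in pattern
    )
    return re.fullmatch(regex, value, re.IGNORECASE) is not None


def plan_conversion(rules: List[CharsetRule], doc_charset: str,
                    accept_charset: str,
                    input_len: int) -> Optional[Tuple[str, Optional[str], int]]:
    """Return (response charset, conversion function or None, output buffer
    size), or None when no rule applies."""
    requested = [c.split(";")[0].strip()
                 for c in accept_charset.split(",") if c.strip()]
    for rule in rules:
        if not wildcard_match(rule.doc_charset, doc_charset):
            continue
        for accepted in rule.accept_charsets:
            for want in requested:
                if wildcard_match(accepted, want):
                    if rule.ncs_function is None:
                        return (want, None, input_len)   # alias only
                    return (want, rule.ncs_function, input_len * rule.factor)
    return None


if __name__ == "__main__":
    rules = [
        CharsetRule("koi8-r", ["utf-8"], "HYPOTHETICAL_KOI8R_TO_UTF8", 2),
        CharsetRule("iso-8859-1", ["iso-latin-1", "latin1"]),
    ]
    print(plan_conversion(rules, "koi8-r", "utf-8, koi8-r", 100))
```

With a factor of 2 the sketch reserves twice the input length for output, mirroring the buffer allocation described for the factor parameter.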
By default the server provides its own internal error reporting facility. These reports may be configured as basic or detailed on a per-path basis, as well as determining the basic "look-and-feel". For more demanding requirements the [ErrorReportPath] configuration directive allows a redirection path to be specified for error reporting, permitting the site administrator to tailor both the nature and format of the information provided. A Server Side Include document, CGI script or even standard HTML file(s) may be specified. Generally an SSI document would be recommended for its combination of simplicity and versatility.
Internally generated error reports are the most efficient. These can be delivered with two levels of error information. The default is more detailed.
ERROR 404 - The requested resource could not be found.
Document not found ... /wasd_root/index.html
(document, bookmark, or reference requires revision)
Additional information: 1xx, 2xx, 3xx, 4xx, 5xx, help
WASD/10.0.0 server at www.example.com port 80
There is also the more basic.
ERROR 404 - The requested resource could not be found.
Additional information: 1xx, 2xx, 3xx, 4xx, 5xx, help
WASD/10.0.0 server at www.example.com port 80
These can be set per-server using the [ReportBasicOnly] configuration directive, or on a per-path basis in the WASD_CONFIG_MAP configuration file. The basic report is intended for environments where traditionally a minimum of information might be provided to the user community, both to reduce site configuration information leakage but also where a general user population may only need or want the information that a document was either found or not found. The detailed report often provides far more specific information as to the nature of the event and so may be more appropriate to a more technical group of users. Either way it is relatively simple to provide one as the default and the other for specific audiences. Note that the detailed report also includes in page <META> information the code module and line references for reported errors.
To default to a basic report for all but selected resource paths introduce the following to the top of the WASD_CONFIG_MAP configuration file.
To provide the converse, default to a detailed report for all but selected paths use the following.
The additional reference information included in the report may be disabled using the appropriate WASD_CONFIG_MSG [status] message item. Emptying this message results in an error report similar to the following.
ERROR 404 - The requested resource could not be found.
WASD/10.0.0 server at www.example.com port 80
The server signature may be disabled using the WASD_CONFIG_GLOBAL [ServerSignature] configuration directive. This results in a minimal error report.
ERROR 404 - The requested resource could not be found.
A simple approach to providing a site-specific "look-and-feel" to server reports is to customize the [ServerReportBodyTag] WASD_CONFIG_GLOBAL configuration directive. Using this directive report page background colour, background image, text and link colours, etc., may be specified for all reports. It is also possible to more significantly change the report format and contents (within some constraints), without resorting to the site-specific mechanisms referred to below, by changing the contents of the appropriate WASD_CONFIG_MSG [status] item. This should be undertaken with care.
Customized error reports can be generated for all or selected HTTP status codes associated with errors reported by the server using the WASD_CONFIG_GLOBAL [ErrorReportPath] and WASD_CONFIG_SERVICE [ServiceErrorReportPath] configuration directives. To explicitly handle all error reports specify the path to the error reporting mechanism (see description below) as in the following example.
To handle only selected error reports add the HTTP status codes following the report path. In this example only 403 and 404 errors are explicitly handled, the rest remain server-generated. This is particularly useful for static error documents.
To exclude selected error reports (and handle all others by default) add the HTTP status codes preceded by a hyphen following the report path. In this example 401 and 500 errors are server-generated.
Site-specific error reporting works by internal redirection. When an error is reported the original request is concluded and the request reconstructed using the error report path before internally being reprocessed. For SSI and CGI script handlers error information becomes available via a specially-built query string, and from that as CGI variables in the error report context. One implication is the original request path and query string are no longer available. All error information must be obtained from the error information in the new query string.
It is suggested with any use of this facility the reporting document(s) be located somewhere local, probably WASD_ROOT:[RUNTIME.HTTPD], and then enabled by placing the appropriate path into the [ErrorReportPath] configuration directive.
Note that virtual services can subsequently have this path mapped to other documents (or even scripts) so that some or all services may have custom error reports. For instance the following arrangement provides each host (service) with a customized error report.
Static HTML documents are a good choice for site-specific error messages. They are very low overhead and are easily customizable. One per possible response error status code is required. When providing an error report path, including a "!UL" introduces the response status code into the file path, giving a report path that contains the three-digit HTTP status code. A file for each possible or configured code must then be provided, in this example for 403 (authorization failure), 404 (resource not found) and 502 (bad gateway/script).
This mapping will generate paths such as the following, and require the three specified to respond to those errors.
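The way a report path, a list of selected or excluded status codes, and the "!UL" substitution combine can be sketched in Python as follows. This is a simplified model, not the directive's actual syntax: the selected and excluded codes are passed as separate sets, the function name and file paths are hypothetical, and a return value of None stands for "leave the report server-generated".

```python
from typing import FrozenSet, Optional


def report_path(template: str, status: int,
                only: FrozenSet[int] = frozenset(),
                exclude: FrozenSet[int] = frozenset()) -> Optional[str]:
    """Return the redirection path for this status, or None when the
    error report should remain server-generated."""
    if only and exclude:
        raise ValueError("this sketch does not mix selected and excluded codes")
    if only and status not in only:
        return None          # not one of the explicitly handled codes
    if status in exclude:
        return None          # explicitly left to the server
    # "!UL" in the path is replaced by the three-digit status code.
    return template.replace("!UL", str(status))


if __name__ == "__main__":
    # Hypothetical static documents, one per handled status code.
    handled = frozenset({403, 404, 502})
    for code in (403, 404, 500, 502):
        print(code, report_path("/errors/!UL.html", code, only=handled))
```

Each path generated this way must exist as a file, which is why one document per configured status code is required.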
SSI documents provide the versatility of dynamic report generation but they do take time and CPU for processing, and this may be a significant consideration on busy sites.
Three example SSI error report documents are provided.
The following SSI variables are available specifically for generating error reports. The <!--#printenv --> statement near the top of the file may be uncommented to view all SSI and CGI variables available.
Variable | Description |
---|---|
ERROR_LINE | The HTTPd source code line from where the error was generated. |
ERROR_MODULE | The HTTPd source code module corresponding to the line described above. |
ERROR_REPORT | A single HTML string providing a detailed error message. |
ERROR_REPORT2 | A single HTML comment providing more detailed VMS error information if available |
ERROR_REPORT3 | A server-generated HTML string providing a brief explanation of the error if available |
ERROR_STATUS_CLASS | Essentially the single hundreds digit from the status code (e.g. 4). |
ERROR_STATUS_CODE | The HTTP response status code representing the error (e.g. 404). |
ERROR_STATUS_EXPLANATION | The HTTP response status code descriptive meaning (e.g. "The requested resource could not be found.") |
ERROR_STATUS_TEXT | The HTTP response status code abbreviated meaning (e.g. "Not Found"). |
ERROR_STATUS_TYPE | "basic" or "detailed". |
ERROR_STATUS_URI | The HTML-escaped URI of the request reporting the error. |
FORM_ERROR_… | A series of CGI variables providing the sources for the above SSI variables, as well as other general environment information. |
It is also possible to report using a script. The same error information is available via corresponding CGI variables. The source code WASD_ROOT:[SRC.MISC]REPORTERROR.C provides such an implementation example.
Significant server events may be optionally displayed via a selected operator's console and recorded in the operator log. Various categories of these events may be selectively enabled via WASD_CONFIG_GLOBAL directives (6. Global Configuration).
Some significant server events are always logged to OPCOM if any one of the above categories is enabled.
WASD provides a versatile access log, allowing data to be collected in Web-standard common and combined formats, as well as allowing customization of the log record format. It is also possible to specify a log period. If this is done log files are automatically changed according to the period specified.
Where multiple access log files are generated with per-instance, per-period and/or per-service logging (see below) these can be merged into single files for administrative or archival purposes using the CALOGS utility.
The Quick-and-Dirty LOG STATisticS utility can be used to provide elementary ad hoc log analysis from the command-line or CGI interface.
Exclude requests from specified hosts using the [LogExcludeHosts] configuration parameter, or using the "SET NOLOG" mapping directive.
The configuration parameter [LogFormat] and the server qualifier /FORMAT specify one of three pre-defined formats, or a user-definable format. Most log analysis tools can process the three pre-defined formats. There is a small performance impost when using the user-defined format, as the log entry must be specially formatted for each request.
The user-defined format allows customised log formats to be specified using a selection of commonly required data. The specification must begin with a character that is used as a substitute when a particular field is empty (use "0" for no substitute, as in the "windows log format" example below).
Two different "escape" characters introduce the following parameters:
Characters | Description |
---|---|
AR | authentication realm (if any) |
AU | authenticated user name (if any) |
BB | bytes in body (excludes response header) |
BQ | quadword bytes in response (includes header) |
BY | bytes in response (includes header) |
CA | client address |
CC | X509 client certificate authorization distinguishing name |
CI | SSL session cipher (e.g. "AES128-SHA", "AES256-SHA256") |
CL | value provided by "Content-Length:" header (cf. "PL") |
CN | client host name (or address if DNS lookup disabled) |
CP | client port |
DI | specified dictionary value |
ID | session track ID - obsolete |
EM | request elapsed time in milliseconds |
ES | request elapsed time in fractional seconds |
ME | request method |
NP | specified notepad value |
PA | request path (not to be confused with "RQ") |
PL | actual body (payload) length received with POST or PUT (cf. "CL") |
PR | request URL (includes protocol scheme) |
QS | request query string (if any) |
RF | referer (if any) |
RQ | complete request string (see below) |
RP | request protocol |
RS | response status code |
SN | server host name |
SC | script name (if any) |
SM | request scheme (http: or https:) |
SP | server port |
SR | SSL session reused |
SV | SSL protocol (e.g. "SSLv3", "TLSv1") |
TC | request time (common log format) |
TI | request time (local in ISO 8601 extended format) |
TS | request time (UTC in ISO 8601 basic format) sortable |
TU | request time (UTC) |
TV | request time (VMS format) |
UA | user agent |
VS | virtual service (service host:port) |
XX | custom, usually site/client-specific, logging item see module [SRC.HTTPD]LOGGING.C functions LoggingCustom..() |
Character | Description |
---|---|
0 | a null character (used to define the empty field character) |
! | insert an "!" |
^ | insert a "^" |
n | insert a newline |
q | insert a quote (so that in DCL the quotes won't need escaping!) |
t | insert a TAB |
Any other character is directly inserted into the log entry.
It is possible to use one of the pre-defined log format keywords with additional user-defined directives appended. The appended directives must include ALL additional literal characters and directives required in the log entry. The syntax is <pre-defined keyword>+<appended format> as in "COMMON+ !EM".
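To illustrate how a user-defined format specification is interpreted, here is a minimal Python sketch. It models only two of the behaviours described above: the leading character acting as the empty-field substitute, and "!" followed by a two-character code being replaced by request data. The second escape character, the pre-defined keywords and the "+" append syntax are deliberately not modelled, and the function name and sample values are hypothetical.

```python
from typing import Mapping


def render_log_entry(spec: str, fields: Mapping[str, object]) -> str:
    """Expand "!XX" directives in a user-defined log format.

    The first character of the specification is the substitute used
    whenever a field is empty or unavailable.
    """
    if not spec:
        raise ValueError("specification must begin with a substitute character")
    empty, body = spec[0], spec[1:]
    out = []
    i = 0
    while i < len(body):
        if body[i] == "!" and i + 3 <= len(body):
            code = body[i + 1:i + 3]              # e.g. "CN", "RS", "BY"
            value = str(fields.get(code, ""))
            out.append(value if value else empty)
            i += 3
        else:
            out.append(body[i])                   # literal character
            i += 1
    return "".join(out)


if __name__ == "__main__":
    request = {"CN": "client.example.com", "AU": "", "RS": 200, "BY": 1234}
    print(render_log_entry("-!CN !AU !RS !BY", request))
```

In this example the empty authenticated user name is rendered as the "-" substitute, keeping the number of fields in each entry constant for log analysis tools.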
The access log file may have a period specified against it, so that log files are generated automatically based on that period. This allows logs to be systematically named, ordered and kept to a manageable size. This is also known as log rotation. The period specified can be one of
The log file changes on the first request after the entering of the new period.
When using a periodic log file, the file name specified by WASD_CONFIG_LOG or the configuration parameter [LogFile] is partially ignored, only partially because the directory component of it is used to locate the generated file name. The periodic log file name generated comprises
as in the following example
For the daily period the date represents the request date. For the weekly period it is the date of the previous (or current) day specified. That is, if the request occurs on the Wednesday for a weekly period specified by Monday the log date shows the last Monday's. For the monthly period it uses the first day of the month.
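The date arithmetic described above can be sketched in Python as follows. The period strings used here ("daily", "monthly" and a weekday name) are names chosen for this sketch and are not claimed to be the directive's keywords.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def log_period_date(request_date: date, period: str) -> date:
    """Return the date that would appear in the periodic log file name."""
    period = period.lower()
    if period == "daily":
        return request_date
    if period == "monthly":
        return request_date.replace(day=1)
    if period in WEEKDAYS:
        # Step back to the most recent (or current) specified weekday.
        days_back = (request_date.weekday() - WEEKDAYS.index(period)) % 7
        return request_date - timedelta(days=days_back)
    raise ValueError(f"unknown period: {period}")


if __name__ == "__main__":
    wednesday = date(2020, 7, 15)
    print(log_period_date(wednesday, "monday"))    # the preceding Monday
    print(log_period_date(wednesday, "monthly"))   # first day of the month
```

Because the log file changes on the first request after a new period begins, all requests that map to the same date share a log file.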
By default a single access log file is created for each HTTP server process. Using the [LogPerService] configuration directive a log file for each service provided by the HTTPd is generated (2.3 Virtual Services). The [LogNaming] format can be any of "NAME" (default) which names the log file using the first period-delimited component of the IP host name, "HOST" (which uses as much of the IP host name as can be accommodated within the maximum 39 character filename limitation under ODS-2), or "ADDRESS" which uses the full IP host address in the name. Both HOST and ADDRESS have hyphens substituted for periods in the string. If these are specified then by default the service port follows the host name component. This may be suppressed using the [LogPerServiceHostOnly] directive, allowing a minimum extra 3 characters in the name, and combining entries for all ports associated with the host name (for example, a standard HTTP service on port 80 and an SSL service on port 443 would have entries in the one file).
To reduce physical disk activity, and thereby significantly improve performance, the RMS characteristics of the logging stream are set to buffer records for as long as possible and only write to disk when buffer space is exhausted (a periodic flush ensures records from times of low activity are written to disk). However when multiple server processes (either in the case of multiple instances on a single node, single instance on each of multiple clustered nodes, or a combination of the two) have the same log files open for write then this buffering and deferred write-to-disk is disabled by RMS, which insists that all records must be flushed to disk for correct serialization and coherency.
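The three [LogNaming] choices differ only in how the host portion of the file name is derived. The following Python sketch shows that derivation in isolation. It does not model the appended service port, the truncation needed to fit the 39 character ODS-2 limit, or the rest of the generated file name, and the function name is hypothetical.

```python
def log_name_host_component(naming: str, host: str, address: str) -> str:
    """Derive the host portion of a per-service log file name."""
    naming = naming.upper()
    if naming == "NAME":
        # First period-delimited component of the host name.
        return host.split(".")[0]
    if naming == "HOST":
        # Full host name with hyphens substituted for periods.
        return host.replace(".", "-")
    if naming == "ADDRESS":
        # Full IP address with hyphens substituted for periods.
        return address.replace(".", "-")
    raise ValueError(f"unknown naming format: {naming}")


if __name__ == "__main__":
    for fmt in ("NAME", "HOST", "ADDRESS"):
        print(fmt, log_name_host_component(fmt, "www.example.com", "192.0.2.10"))
```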
This introduces measurable latency and a potentially significant bottleneck to high-demand processing. Note that it only becomes a real issue under load. Sites with a low load should not experience any impact.
Sites that may be affected by this issue can revert to the original buffered log stream by enabling the [LogPerInstance] configuration directive. This ensures that each log stream has only one writer by creating a unique log file for each instance process executing on the node and/or cluster. It does this by appending the node and process name to the file type. This would change the log name from something like
Of course the number-of and naming-of log files is beginning to become a little intimidating at this stage! To assist with managing this seeming plethora of access log files is the CALOGS utility, which allows multiple log files to be merged whilst keeping the records in timestamp order.
When per-period or per-service logging is enabled the access log file has a specific name generated. Part of this name is the host's name or IP address. By default the host name is used, however if the host IP address is specified the literal address is used, hyphens being substituted for the periods. Accepted values for the [LogNaming] configuration directive are:
Examples of generated per-service (non-per-period) log names:
Examples of generated per-period (with/without per-service) log names:
Examples of generated per-instance (per-service and per-period) log names:
Access tracking has been obsoleted with WASD v11.0.
It is possible to mark a path as being of specific interest. When this is accessed by a request the server puts a message into the server process log and, perhaps of greater immediate utility, the increase in alert hits is detected by HTTPDMON and this (optionally) provides an audible alert allowing immediate attention. This is enabled on a per-path basis using the SET mapping rule. Variations on the basic rule allow some control over when the alert is generated.
The special case ALERT=integer allows a path to be alerted if the final response HTTP status is the same as the integer specified (e.g. 501, 404) or within the category specified (599, 499).
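The status comparison for ALERT=integer can be sketched in Python as below. The sketch assumes, based on the examples given (599, 499), that a value ending in 99 denotes a whole status category while any other value must match the response status exactly. The function name is hypothetical.

```python
def alert_on_status(alert: int, status: int) -> bool:
    """Decide whether a final response status triggers the path alert."""
    if alert % 100 == 99:
        # Category form, e.g. 499 covers every 4xx response (assumed).
        return status // 100 == alert // 100
    # Exact form, e.g. 404 matches only a 404 response.
    return status == alert


if __name__ == "__main__":
    print(alert_on_status(404, 404))   # exact match
    print(alert_on_status(499, 403))   # any 4xx response
    print(alert_on_status(599, 404))   # 5xx category, 4xx response
```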
This section does not pretend to be a complete guide to keeping the "bad guys" out. It does provide a short guide to making a site more-or-less liberal in the way the server supplies information about the site and itself. The reader is also strongly recommended to a number of hard copy and Web based resources on this topic.
The WASD package had its genesis in making the VMS operating system and associated resources, in a development environment, available via Web technology. For this reason configurations can be made fairly liberal, providing information of use in a technical environment, but that may be superfluous or less-than-desirable in other, possibly commercial environments. For instance, directory listings can contain VMS file system META information, error reports can be generated with similar references along with reporting source code module and line information.
The example configuration files contain a fairly restrictive set of directives. When relaxing these recommendations keep in mind that the more information available about the underlying structure of the site the more potential for subversion. Do not enable functionality that contributes nothing to the fundamental usefulness of the site, or that has the real potential to compromise any given site. This section refers to configuration directives discussed in more detail in later chapters.
It is established wisdom that the only secure computing system is one with no users and no access, that system security is inversely proportional to system usability, and that making something idiot-proof results in only idiots using it. So there are some trade-offs but …
WASD_ROOT:[WASDOC.MISC]WASD_ADVISORY_020925.TXT
https://www.cvedetails.com/cve/CVE-2002-1825
This research has resulted in these server flaws being closed and package security considerations being extensively reviewed. As a result WASD v8.1 was much more resistant to such penetration than previous releases (and slightly less easy to use, but that's one of those trade-offs). My assessment would be that if Gailly did not find it then it wasn't there to find!
Of course any given site's security is a function of the underlying package's security profile, with the site's implementation of that, AND other considerations such as local authorization and script implementations. Pay particular and ongoing attention to site security and integrity.
This is the merest of mentions for a topic that literally encompasses volumes!
Each site is very much an individual combination of configurations and applications. Each site therefore has specific potential vulnerabilities that should be known about and addressed where possible. Especially if you have an Internet-facing site then this means you!
Many tools exist at the time of writing that didn't fifteen years before when WASD was investigated as described above. Some are on-line, "free" site health checks and penetration testing. Others are tools that can (often) be used from your platform of choice, many of which are free and open-source (FOSS). We are spoiled for choice.
In WASD's earlier years tools such as Apache Bench, WASD Bench, along with batched cURL and wget requests were used to exercise and, in some limited fashion, fuzz the server (providing invalid, unexpected, or random request data) in an effort to discover flaws in server code and execution.
Currently the WASD development bench uses the OWASP ZAP tool to provide a much more comprehensive exercise and test environment.
ZAP is cross-platform (Linux, macOS, Windows, other), GUI-based, Java-implemented, and may be used effectively, though certainly not to its full capabilities, after fifteen minutes with the introductory documentation. ZAP is a highly recommended tool for site vulnerability assessment.
ZAP is used to exercise the in-development WASD, in particular the following aspects (not in any particular order).
It should be noted that these capabilities are provided "out-of-the-box", are the subset of that out-of-the-box functionality of particular interest in WASD development, and utilise only a tiny percentage of ZAP's total capabilities.
At the time of writing, OWASP ZAP does not support the HTTP/2 protocol. The solution for exercising WASD is to use the nghttpx proxy utility.
It can be configured to accept HTTP and HTTPS connections at the front end (ZAP) and convert HTTP/1.1 requests to HTTP/2 requests at the back end (WASD). This introduces a proxy like this:
ZAP and nghttpx can be run on the same or on separate systems.
On a suitable platform (Linux, macOS, MS Windows – not ported to VMS) use this at the command-line.
Where 0.0.0.0 is any address on the nghttpx platform and port the IP port on that platform ZAP will connect to. The WASD-server is the host name or address of the WASD system with port the usual 443. The workers integer is the number of threads used on the platform, with the maximum number of HTTP/2 back end connections maintained to the WASD system. The number of concurrent requests is determined by ZAP concurrency.
For example:
The following table provides recommended file protection settings for package top-level directories. Subdirectories share their parents' settings. The package tree is owned by the SYSTEM account. Directories with world READ access have no ACLs. Other directories, not accessible to the world, but sometimes having other degrees of access to one or more accounts, always have rights identifiers (see below) and associated ACLs to control directory access, and to propagate required access to files created beneath them. The server selectively enables SYSPRV to provide access to some of these areas (e.g. for log creation).
Some pre-v8.1 directories are not included in this table. These are not significant in versions from 8.1 onwards and may be deleted. They can continue to exist however and the security procedures described below ensure that they comply to the general post-8.1 security model. The file access permissions indicated below are for directory contents. The directory files themselves have settings appropriate for content access.
Directory | Access World | Access Other | Description |
---|---|---|---|
[AXP-BIN] | none | script:RE | Alpha executable script files |
[AXP] | none | none | Alpha build and utility area |
[CGI-BIN] | none | script:RE | architecture-neutral script files |
[EXAMPLE] | read | (world) | package examples |
[EXERCISE] | read | (world) | package test files |
[HTTP$NOBODY] | none | script:RWED | scripting account default home area |
[HTTP$SERVER] | none | server:RWED | server account default home area |
[IA64-BIN] | none | script:RE | Itanium executable script files |
[IA64] | none | none | Itanium build and utility area |
[INSTALL] | read | (world) | installation, update and security procedures |
[LOCAL] | none | none | site configuration files |
[LOG] | none | none | site access logs |
[LOG_SERVER] | none | server:RWED | server process (SYS$OUTPUT) logs |
[RUNTIME] | read | (world) | graphics, help files, etc. |
[SCRATCH] | none | script:RWED | working file space for scripts |
[SCRIPT] | none | none | example architecture-neutral scripts |
[SRC] | none | (world) | package source files |
[STARTUP] | none | server:RE | package startup procedures |
[VAX-BIN] | none | script:RE | VAX executable script files |
[VAX] | none | none | VAX build and utility area |
[WASDOC] | read | (world) | package documentation |
It is recommended site-specific directories have settings applied appropriate to their function in comparison to similar package directories. See below for tools to assist in this.
Three rights identifiers provide selective access control to the directory tree. Identifiers were used to allow maximum flexibility for a site in allowing required accounts access to either execute the server or execute scripts. Non-default account names only need to be granted one of these identifiers to be provided with that role's access. Installation, update and/or security utilities create and maintain these identifiers appropriately.
Identifier | Description |
---|---|
WASD_HTTP_SERVER | Indicates the default server account. |
WASD_HTTP_NOBODY | Indicates the default scripting account. |
WASD_IGNORE_THIS | Looked for by the SECHAN utility to avoid it changing security on site-specific files. |
These rights identifiers are applied to directories and files to provide the required level of access. The following example shows the security setting of the top-level CGI-BIN.DIR and one of its content files.
As noted above, WASD version 8.1 and later is much more conservative in what it makes generally available from the package tree, and a site administrator now has to take extraordinary measures to open up certain sections, making it a much more difficult and deliberate action. The package installation, update and security procedures and their associated utilities should always be used to ensure that the installed package continues to conform to the security baseline.
Package security may be "refreshed" or reapplied at any time, and this should be done periodically to ensure that an installed package has not inadvertently been opened to access where it shouldn't have. Of course this is not a guarantee that any given site is secure. Site security is a function of many factors; package vulnerabilities, site configuration, deployed scripts, cracker determination and expertise, etc., etc. What refreshing the security baseline does is provide a known secure (and WASD-community scrutinized) starting point. It should be used as part of a well considered site security maintenance program.
The following DCL procedure resets the package security baseline.
It guides the administrator through a number of stages
of which each one may be declined. After all of these steps it searches for and executes if found the DCL procedure WASD_ROOT:[INSTALL]SECURE.COM. The intent of this file is to allow a site to automatically update any site-specific security settings (and of course modify any set by the main procedure).
The SECHAN utility (pronounced "session") is used by SECURE.COM and the associated procedures to make file system security settings. It is also available for direct use by the site administrator.
One of the more useful functions of SECHAN is applied using the /IGNORE qualifier.
This ACE can be removed from a file (leaving other entries of any ACL intact) using the /NOIGNORE qualifier. This makes the file(s) subject to the SECHAN utility again.
Other functionality may prove useful when applied to local parts of the package or web structure.
This section provides only a basic description. More detail may be found in the prologue to the source code.
Not only does it make it easier to manage site content, it is also good security practice to keep server package and site content completely separate (2.2 Site Organisation).
This can also be applied to scripts, both source and build areas. Keep your business logic out of the package source tree and potentially prying eyes. The script executables themselves can be placed into the package scripting directories but should be built independently from these and copied using locally maintained DCL procedures from build into scripting areas (the WASD_ROOT:[INSTALL]SECURE.COM procedures described above may be useful here).
Various configuration and mapping directives can be used to make the site environment more or less liberal in the information it implicitly can provide.
Published guidelines for securing a Web site generally advise against automatic directory listing generation. Where a home page is not available this may leak information on other directory contents, provide parent and child directory access, etc. Compounding this is the WASD facility to force a listing by providing a directory URL with file wildcards (not to decry the usefulness in some environments).
The mapping rule "SET DIR=keyword" can be used to change this on a per-path basis (10.5.5 SET Rule).
Conservative recommendation: Set "[DirAccess] selective" allowing listing for directories containing a file named ".WWW_BROWSABLE", disable [DirMetaInfo] and [DirWildcard].
Reports are pages generated by the server, usually to indicate an error or other non-success condition, but sometimes to indicate success (e.g. after a successful file upload). Reports provide either basic or detailed information about the situation. Sometimes the detailed information includes VMS file system details, system status codes etc. To limit this information to a minimum indication adjust the following directives.
The mapping rule "SET REPORT=keyword" can be used to change some of these on a per-path basis (10.5.5 SET Rule).
Conservative recommendation: Provide minimal error information by enabling [ReportBasicOnly] and disabling [ReportMetaInfo]. Enable [ServerSignature] to provide a slightly more friendly report (server software can easily be obtained from the response header anyway).
If a static site is all that's required this source of compromise can simply be avoided.
Conservative recommendation: Only deploy scripts your site will actually be using. Remove all the files associated with any other scripts. Do not allow obsolete script environments to remain active. Be proactive.
Also see ‘Securing Scripting’ in 3.5.4 Server Side Includes.
SSI documents are pages containing special markup directives interpreted by the server and replaced with dynamic content. This can include detail about the server, the file or files making up the document, and can even include DCL commands and procedure activation for supplying content into the page. All this by anyone who can author on the site.
The mapping rule "SET SSI=keyword" can be used to change some of this on a per-path basis (10.5.5 SET Rule).
Conservative recommendation: Disable [SsiExec].
Scripting has been a notorious source of server compromise, particularly within Unix environments where script process shell command-line issues require special attention. The WASD CGI scripting interface does not pass any arguments on the command line, and is careful not to allow substitution when constructing the CGI environment. Nevertheless, script behaviours cannot be guaranteed and care should be exercised in their deployment (ask me!)
It is strongly recommended to execute scripts in an account distinct from that executing the server. This should also mean that the accounts are not members of the same group nor should it be a member of any other group. This minimises the risk of both unintentional and malicious interference with server operation through either Inter-Process Communication (IPC) or scripts manipulating files used by the server. The PERSONA facility can be used to further differentiate script activities. See "Scripting Overview" for further detail.
The default WASD installation creates two such accounts, with distinct UICs, usernames and home directory space. Nothing should be assumed or read into the scripting account username - it's just a username.
Username | Description |
---|---|
HTTP$SERVER | Server Account |
HTTP$NOBODY | Scripting Account |
During startup the server checks for the existence of the default scripting account and automatically configures itself to use this for scripting. If it is not present it falls back to using the server account. Other account names can be used if the startup procedures are modified accordingly. The default scripting username may be overridden using the /SCRIPT=AS=<username> qualifier (also see the "Scripting Overview").
Authorization issues imply controlling access to various resources and actions and therefore require careful planning and implementation if compromise is to be avoided. WASD has a quite capable and versatile authorization and authentication environment, with a significant number of considerations.
WASD authorization cannot be enabled without the administrator configuring at least three resources, and therefore cannot easily be "accidentally" activated. One of these is the addition of a startup qualifier controlling where authentication information may be sourced. Another is the server configuration file. The third is the mapping of paths against authorization configuration.
For sites that may be particularly sensitive about inadvertent access to some resources it is possible to use the authorization configuration file as a type of cross-check on the mapping configuration file. The server /AUTHORIZATION=ALL startup qualifier forces all access to be authorized (even if some are marked "none"). This means that if something "escapes" via the mapping file it will very likely be "caught" by an absence in the authorization file.
Although it is of limited usefulness, because server identity may be deduced from behaviour and other indicators, the exact server and version may be obscured by using the otherwise undocumented /SOFTWARE= qualifier to change the server identification string to (basically) whatever the administrator desires. This identification is included as part of all HTTP response headers.
Historically and by default server configuration and authorization sources are contained within the server package tree. There is no reason why they cannot be located anywhere the site prefers. Generally all that is required is a change to logical name definition and server startup.
Version 8.1 and later is much more conservative in what it makes available of the package tree via the server. The package installation, update and security procedures and their associated utilities should always be used to ensure that the installed package continues to conform to the security baseline. See 3.3 Maintaining Package Security.
Furthermore, with many sites there may be little need to access the full, or any of the WASD package tree. A combination of mapping and/or authorization rules can relatively simply block or control access to it. These examples can be easily tailored to suit a site's specific requirements.
This example shows blocking all access to the /wasd_root/ tree, except for documentation, source code, examples and exercise (performance results) areas.
The next example forbids all access to the package tree unless authorized (the authorization detail would vary according to the site). It also allows modify access for the Server Administration page and to the /wasd_root/local/ area.
The following example shows how this might occur.
Authorization rules can be used to effectively block access to any VMS file specification (it cannot be done during mapping because the translation from path to file system is not performed until mapping is complete).
or to selectively allow access
This is not a treatise on Web security and the author is not a security specialist. This is some general advice based on observation. There is little one can do at the server itself to reduce a concerted attack against a site. Common objectives of such attacks include the following (not an exhaustive list).
Where a general attack is launched against a specific platform (a combination of operating system and Web server software). Often these can be due to wide-spread infection of systems, meaning many attacks are being launched from a large number of systems (often without the system owners' knowledge or cooperation).
WASD, and OpenVMS in particular, are generally immune to such attacks because they are not Microsoft or Unix based. The impact of the attack becomes one of nuisance-value traffic as the site is probed by the (sometimes very large number of) source systems.
Where a specific attack is made against a site in an attempt to exploit a known vulnerability associated with that platform or environment.
These are perhaps the most worrying, although the security-by-obscurity element works in favour of WASD and OpenVMS in this case. Neither are as common as other platforms and therefore do not receive as much attention.
Denial of Service (DOS) attacks usually comprise flooding a site with requests in an effort to consume all available network or server resources, making it unavailable for legitimate use.
These can be insidious, flooding network equipment as well as systems. Attempts at control are best undertaken at the periphery of the network (routers) although concerted attacks can succeed against the best prepared network.
Where a systematic attempt to break into one or more accounts is undertaken. These are often repeated, dictionary-based password-guessing attacks.
WASD's authentication functionality notes successive password validation failures and after a reasonable number disables all access via the username for a constantly extended period. Passwords stop being checked and so a dictionary-based attack cannot succeed. Password validation failures can be recorded via OPCOM.
Knowing of or searching for resources that should be controlled by authorization but are not.
WASD's /AUTHORIZATION=ALL functionality may assist here (‘Securing Authorisation’ in 3.6 Scripting).
There are a few strategies for reducing the load on a server experiencing a generalized attack or probing. These can also be used to "discourage" the source from considering the site an easy target. Unfortunately most require request acceptance and at least some processing before taking action. The general idea is to identify either the source site or some characteristic of the request that indicates it could not possibly be legitimate. Most platform-specific attacks have such a signature. For instance attacks against Microsoft platforms often involve probes for backdoors into non-server executables. These can be identified by the path containing strings such as "/winnt/", "/system32/", "/cmd.exe" or variations on them. This style will be used in examples below.
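The signature idea described above can be sketched outside the server. The following Python fragment is only an illustration of the technique, not WASD configuration syntax; the function name and the tuple of signatures are hypothetical, with the three strings taken from the paragraph above.

```python
# Illustrative only: flag request paths that carry a platform-specific
# probe signature. The signature strings come from the text above; the
# names used here are hypothetical and not part of WASD.
PROBE_SIGNATURES = ("/winnt/", "/system32/", "/cmd.exe")


def looks_like_probe(request_path):
    """Return True if the path contains any known probe signature."""
    lowered = request_path.lower()  # compare without regard to case
    return any(signature in lowered for signature in PROBE_SIGNATURES)


if __name__ == "__main__":
    for path in ("/index.html", "/scripts/../winnt/system32/cmd.exe"):
        print(path, looks_like_probe(path))
```

A request flagged in this way could not be legitimate on a VMS system, so it can be rejected with the least possible processing.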
Content Security Policy (CSP) is an added layer of security that helps to detect and mitigate certain types of attacks, including Cross Site Scripting (XSS) and data injection attacks.
https://en.wikipedia.org/wiki/Content_Security_Policy
https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP
WASD provides CSP support using mapping rules. See 10.5.5 SET Rule. WASD allows configuration of policy using the set response=csp=policy rule and reporting only of policy violations using set response=cspro=policy. WASD includes a (basic) violation reporting utility. See CSPreport[er] in WASD Features and Facilities.
4.1 Wildcard Patterns |
4.2 Regular Expressions |
4.3 Examples |
4.4 Expression Substitution |
Matching of strings is a pervasive and important function within the server. Two types are supported; wildcard and regular expression. Wildcard matching is generally much less expensive (in CPU cycles and time) than regular expression matching and so should always be used unless the match explicitly requires otherwise. WASD attempts to improve the efficiency of both by performing a preliminary pass to make simple matches and eliminate obvious mismatches using a very low-cost comparison. This either matches or doesn't, or encounters a pattern matching meta-character which causes it to undertake full pattern matching.
To assist with the refinement of string matching patterns the Server Administration facility has a report item named "Match". This report allows the input of target and match strings and allows direct access to the server's wildcard and regular expression matching routines. Successful matches show the matching elements and a substitution field (4.4 Expression Substitution) allows resultant strings to be assessed.
To determine what string match processing is occurring during request processing in the running server use the match item available from the Server Administration WATCH Report.
Wildcard patterns are simple, low-cost mechanisms for matching a string to a template. They are designed to be used in path and authorization mapping to compare a request path to the root (left-hand side) or a template expression.
Expression | Purpose |
---|---|
* | Match zero or more characters (non-greedy) |
** | Match zero or more characters (greedy) |
% | Match any one character |
Wildcard matching uses the '*' and '%' symbols to match any zero or more, or any one character respectively. The '*' wildcard can either be greedy or non-greedy depending on the context (and for historical reasons). It can also be forced to be greedy by using two consecutive asterisks ('**'). By default it is not greedy when matching request paths for mapping or authentication, and is greedy at other times (matching strings within conditional testing, etc.).
Non-greedy matching attempts to match an asterisk wildcard up until the first character that is not the same as the character immediately following the wildcard. It matches a minimum number of characters before failing. Greedy matching attempts to match all characters up until the first string that does not match what follows the asterisk.
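The difference between the two behaviours can be approximated with a short Python sketch. This is not the server's matching routine; it translates a wildcard pattern into a Python regular expression, on the assumption that lazy and greedy quantifiers are a fair stand-in for non-greedy and greedy asterisks. The function names and the sample path are hypothetical.

```python
import re


def wildcard_to_regex(pattern, default_greedy=False):
    """Translate '*', '**' and '%' wildcards into a compiled regex.

    Approximation only: '*' becomes a lazy group unless default_greedy
    is set, '**' is always greedy, '%' matches exactly one character.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern[i:i + 2] == "**":
            parts.append("(.*)")  # forced greedy
            i += 2
            continue
        if pattern[i] == "*":
            parts.append("(.*)" if default_greedy else "(.*?)")
        elif pattern[i] == "%":
            parts.append(".")  # any one character, not stored
        else:
            parts.append(re.escape(pattern[i]))
        i += 1
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def match_wildcard(pattern, target, default_greedy=False):
    """Return the strings matched by each asterisk, or None on mismatch."""
    matched = wildcard_to_regex(pattern, default_greedy).fullmatch(target)
    return None if matched is None else list(matched.groups())


if __name__ == "__main__":
    target = "/one/two/three/four.txt"
    print(match_wildcard("/*/*", target))   # non-greedy first asterisk
    print(match_wildcard("/**/*", target))  # greedy first asterisk
```

With the non-greedy pattern the first asterisk stops at the first "/" and the remainder goes to the second asterisk; with "**" the first asterisk takes everything up to the last "/".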
To illustrate; using the following string
Regular expression matching is case insensitive (in line with other WASD behaviour) and uses the POSIX EGREP pattern syntax and capabilities. Regular expression matching offers significant but relatively expensive functionality. One of those expenses is expression compilation. WASD attempts to eliminate this by pre-compiling expressions during server startup whenever feasible. Regular expression matching must be enabled using the [RegEx] WASD_CONFIG_GLOBAL directive; regular expressions are then differentiated from wildcard patterns by a leading "^" character.
A detailed tutorial on regular expression capabilities and usage is well beyond the scope of this document. Many such hard-copy and on-line documents are available.
http://en.wikipedia.org/wiki/Regular_expression
This summary is only to serve as a quick mnemonic. WASD regular expressions support the following set of operators.
Description | Usage |
---|---|
Match-self Operator | Ordinary characters. |
Match-any-character Operator | . |
Concatenation Operator | Juxtaposition. |
Repetition Operators | * + ? {} |
Alternation Operator | | |
List Operators | [...] [^...] |
Grouping Operators | (...) |
Back-reference Operator | \digit |
Anchoring Operators | ^ $ |
Backslash Operator | Escape meta-character; i.e. \ ^ . $ | [ ( |
The following operators are used to match one, or in conjunction with the repetition operators more than one, character of the target string. These single and leading characters are reserved meta-characters and must be escaped using a leading backslash ("\") if required as a literal character in the matching pattern. Note that this does not apply to the range hyphen; to include a hyphen in a range ensure the character is the first or last in the range.
Expression | Purpose |
---|---|
^ | Match the beginning of the line |
. | Match any character |
$ | Match the end of the line |
| | Alternation (or) |
[abc] | Match only a, b or c |
[^abc] | Match anything except a, b and c |
[a-z0-9] | Match any character in the range a to z or 0 to 9 |
Repetition operators control the extent, or number, of whatever the matching operators match. These are also reserved meta-characters and must be escaped using a leading backslash if required as a literal character.
Expression | Function |
---|---|
* | Match 0 or more times |
+ | Match 1 or more times |
? | Match 0 or 1 times |
{n} | Match exactly n times |
{n,} | Match at least n times |
{n,m} | Match at least n but not more than m times |
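The operators summarised in the tables above can be tried out with any regular expression engine that shares them. The following sketch uses Python's re module as a stand-in; it is not the POSIX EGREP implementation used by the server, and it does not model the leading "^" that WASD uses to distinguish a regular expression from a wildcard pattern. The helper name and sample strings are hypothetical.

```python
import re


def egrep_like_match(expression, target):
    """Case-insensitive search, mirroring the case handling described above."""
    return re.search(expression, target, re.IGNORECASE) is not None


if __name__ == "__main__":
    samples = [
        # anchors, grouping, alternation and an escaped literal period
        (r"^/cgi-bin/.+\.(com|exe)$", "/CGI-BIN/test.COM"),
        # list operator with ranges, bounded repetition
        (r"^[a-z0-9]{3,8}$", "Klaatu1"),
        # optional character
        (r"colou?r", "COLOR"),
        # back-reference to the first group
        (r"(ab)\1", "xxababxx"),
    ]
    for expression, target in samples:
        print(expression, target, egrep_like_match(expression, target))
```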
The following provides a series of examples as they might occur in use for server configuration.
Expression substitution is available during path mapping (10. Request Processing Configuration). Both wildcard (implicitly) and regular expressions (using grouping operators) note the offsets of matched portions of the strings. These are then used for wildcard and specified wildcard substitution where result strings provide for this (e.g. mapping 'pass' and 'redirect' rules). A maximum of nine such wildcard substitutions is supported (one other, the zeroth, is the full match).
With wildcard matching each asterisk wildcard contained in the pattern (template string) has matching characters in the target string noted and stored. Note that for the percentage (single character) wildcard no such storage is provided. These characters are available for substitution using corresponding wildcards present in the result string. For instance, the target string
With regular expression matching the groups of matching characters must be explicitly specified using the grouping parentheses operator. Hence with regular expression matching it is possible to match many characters from the target string without retaining them for later substitution. Only if that match is designated as a substitution source do the matching characters become available for substitution via any result string. Using two possible target strings as an example
By default the strings matched by wildcard or grouping operators are substituted in the same order in which they are matched. This order may be changed by specifying which wildcard string should be substituted where. Not all matched (and stored) strings need to be substituted. Some may be omitted and the contents effectively ignored.
The specified substitution syntax is a result wildcard followed by a single-apostrophe (') and a single digit from zero to nine (0…9). The zeroth element is the full matching string. Element one is the first matching part of the expression, on through to the last. Specifying an element that had no matching string substitutes an empty string (i.e. nothing is added). Using the same target string as in the previous example
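As a rough model of the substitution behaviour described in this section, the following Python sketch expands a result string from a list of already matched strings. It is an illustration under stated assumptions, not the server's code: the function name, the sample strings, and the choice that plain wildcards keep their own running count independently of specified ones are all hypothetical.

```python
def substitute(result, full_match, matched):
    """Expand '*' and "*'N" in a result string.

    full_match is element zero; matched holds elements one onward
    (at most nine are used). A missing element yields an empty string.
    """
    elements = [full_match] + list(matched)[:9]
    next_default = 1  # element used by the next plain '*'
    out = []
    i = 0
    while i < len(result):
        if result[i] != "*":
            out.append(result[i])
            i += 1
            continue
        specified = (
            result[i + 1:i + 2] == "'"
            and len(result) > i + 2
            and result[i + 2] in "0123456789"
        )
        if specified:
            index = int(result[i + 2])  # "*'N" selects element N
            i += 3
        else:
            index = next_default  # plain '*' takes the next in order
            next_default += 1
            i += 1
        out.append(elements[index] if index < len(elements) else "")
    return "".join(out)


if __name__ == "__main__":
    full, parts = "/old/alpha/beta", ["alpha", "beta"]
    print(substitute("/new/*/*", full, parts))      # same order
    print(substitute("/new/*'2/*'1", full, parts))  # order reversed
    print(substitute("/new/*'1", full, parts))      # second element omitted
```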
Request processing (WASD_CONFIG_MAP) and authorization (WASD_CONFIG_AUTH) rules may be conditionally applied depending on request, server or other characteristics. These include
As described in 2.3.1 [[virtual-server]] a [[host:port]] rule applies subsequent configuration depending on whether the request service matches the specified service. This makes it a fundamental element of conditional configuration.
Note that service conditionals impose a boundary on the scope of if..endif constructs. That is, an if..endif may not span a virtual service conditional. A conditional flow syntax error is reported if an if..endif construct is not properly closed before encountering a subsequent [[host:port]] rule.
These may be nested up to a maximum depth of eight, are not case sensitive and generally match via string comparison, although some tests are performed as boolean operations, by converting the conditional parameter to a number before comparison, and IP address parameters will accept a network mask as well as a string pattern.
The basis of much conditional decision making is string pattern matching. Both wildcard and regular expression based pattern matching are available (4. String Matching). Wildcard matching in conditional tests is greedy. Regular expression matching, in common with usage throughout WASD, is differentiated from wildcard patterns using a leading "^" character.
Conditional expressions and processing flow structures may be used in the following formats. Conditional and rule text may be indented for clarifying structure.
Logical operators are also supported, in conjunction with precedence ordering parentheses, allowing moderately complex compound expressions to be applied in conditionals.
! | logical negation |
---|---|
&& | logical AND |
|| | logical OR |
There are two more conditional structures that allow previous decisions to be reused. These are unif and ifif. The first unconditionally includes rules regardless of the current state of execution. The second resumes execution only if the previous if or elif expression was true. The else statement may also be used after an unif to continue only if the previous expression was false. The purpose of these constructs is to allow a single decision statement to include both conditional and unconditional rules.
Although the server cannot determine the correct intent of an otherwise syntactically correct conditional, if it encounters an unexpected but detectable condition during processing it aborts the request, supplying an appropriate error message.
Flow control errors (e.g. an if not closed by a subsequent endif) abort all rule processing and provide a fatal error report to the client.
The following keywords provide a match between the corresponding request or other value and a string immediately following the delimiting colon. White space or other reserved characters may not be included unless preceded by a backslash. The actual value being used in the conditional matching may be observed using the mapping item of the WATCH facility.
Keyword | Description |
---|---|
accept: | Browser-accepted content types as listed in the "Accept:" request header field. Same string as provided in CGI variable HTTP_ACCEPT. |
accept-charset: | Browser-accepted character sets as listed in the "Accept-Charset:" request header field. CGI variable HTTP_ACCEPT_CHARSET. |
accept-encoding: | Browser-accepted content encoding as listed in the "Accept-Encoding:" request header field. CGI variable HTTP_ACCEPT_ENCODING. |
accept-language: | Browser language preferences as listed in the "Accept-Language:" request header field. CGI variable HTTP_ACCEPT_LANGUAGE. |
authorization: | The raw authorization string from the request header, if any supplied. This could be simply used to test whether it has been supplied or not. |
callout: | Simple boolean value. If a script callout is in progress (see "Scripting Overview, CGI Callouts") it is true, otherwise false. |
client_connect_gt: | An integer representing the current network connections (those currently being processed plus those currently being "kept alive") for the particular client represented by the current request. If greater than this value returns true, otherwise false. See 2.6 Client Concurrency. |
cluster_member: | If the supplied node name is (perhaps currently) a member of the cluster (if any) the server may be executing on. |
command_line: | The command line qualifiers and parameters used when the server image was activated. |
cookie: | Raw cookie data as the text string provided in "Cookie:" request header field. CGI variable HTTP_COOKIE. |
decnet: | Whether DECnet is active on the system and which version is available. This value will be 0 if not active, 4 if Phase IV or 5 if Phase V. |
dict: | Matches the specified dictionary entry. See 5.5.4 WATCH Dictionary. |
directory: | Tests whether the specified directory exists or not. Parameter can be a URI available for mapping by the server or a VMS file-system specification. If no parameter is supplied the request path is mapped to a file-system specification. As this conditional accesses the file-system it can be relatively expensive in terms of server latency. |
document_root: | The DOCUMENT_ROOT CGI variable SET using the map=root=<string> mapping rule. |
file: | Tests whether the specified file exists or not. Parameter can be a URI available for mapping by the server or a VMS file-system specification. If no parameter is supplied the request path is mapped to a file-system specification. The specification can be a directory. As this conditional accesses the file-system it can be relatively expensive in terms of server latency. |
forwarded: | Proxy/gateway host(s) request forwarded by, as specified in request header field "Forwarded:". CGI variable HTTP_FORWARDED. |
host: | The host (and optionally port) specified in request header "Host:" field. This is used by all modern browsers to provide virtual host information to the server. CGI variable HTTP_HOST. |
http2: | Simple boolean value. Is true if the request is being transported using HTTP/2. |
instance: | Used to check whether a particular, clustered instance of WASD is available. See 5.3.4 Instance: and Robin: Keywords. |
jpi_username: | The account username the server is executing as. |
mapped_path: | The path resulting from mapping (phase 2 if script path involved) from which the path-translated is derived. |
multihome: | Somewhat specialised conditional that becomes non-null when a client used a different IP address to connect to the service than the one the service is bound to. Is set to the IP address the client used and may be matched using wildcard matching or as a network mask. |
note: | Ad hoc information (string) provided by the server administrator using the /DO=NOTE= facility (and online equivalent) that can be used to quickly and easily modify rule processing on a per-system or per-cluster basis. |
notepad: | Information (strings) stored using the SET notepad= mapping rule. See 5.3.1 Notepad: Keyword. |
ods: | Specified as 2 or 5 (Extended File System), or as SRI file name encoding (MultiNet NFS and others), PWK encoding (PATHWORKS 4/5), ADS encoding (Advanced Server / PATHWORKS 6), SMB encoding (Samba - same as ADS). |
pass: | A numeric value, 1 or 2, representing the first or second pass (if a script component was parsed) through the path mapping rules. Will be zero at other times. When the server is reverse-mapping a file specification will be -1. |
path-info: | Path specified in the request line. CGI variable PATH_INFO. |
path-translated: | VMS translation of path-info. Available after rule mapping (i.e. during authorization rule processing). |
proctor: | Simple boolean value. If a proctored script this is true (see Script Proctor in WASD Scripting). |
query-string: | Query string specified in request line. Same information as provided in CGI variable QUERY_STRING. |
rand: | Value from a random number generator. See 5.3.2 Rand: Keyword. |
redirected: | If a request has been internally redirected (10.5.2 REDIRECT Rule) this conditional will be non-zero. Can be used as a boolean or with a digit specified. |
referer: | URL of referring page as provided in "Referer:" request header field. CGI variable HTTP_REFERER. |
regex: | Simple boolean value. If configuration directive [RegEx] is enabled (and hence regular expression string matching, 4. String Matching) this will be true. |
remote-addr: | Client IP address. Same as provided as CGI variable REMOTE_ADDR. As with all IP addresses used for conditional testing this may be a wildcard string match or a network mask expressed as address/mask-length (see 5.3.7 Host Addresses). A domain (host) name preceded by a question mark may be specified (e.g. "?the.host.name"). The corresponding IP address is then looked up and compared to the client. This allows ad hoc host name based rules and is distinct from use of remote-host. Note that DNS lookup can introduce some latency to rule (and request) processing. |
remote-host: | Client host name if name resolution enabled, otherwise the IP address (same as remote-addr). CGI variable REMOTE_HOST. |
request: | Detect the presence of specific or unknown request fields. See 5.3.3 Request: Keyword. |
request-method: | HTTP method ("GET", "POST", etc.) specified in the request line. CGI variable REQUEST_METHOD. |
request-protocol: | Detect the HTTP protocol in use for the request, as "2", "1.1", "1.0" or "0.9". Note that the server-protocol conditional will indicate 1.1 when the request-protocol indicates 2. The server and its applications (scripts) still treat it semantically as HTTP/1.1. |
request-scheme: | Request protocol as "http:" or "https:". CGI variable REQUEST_SCHEME. |
request-uri: | The unescaped request path plus any query-string. CGI variable REQUEST_URI. |
restart: | A numeric value, zero to maximum, representing the number of times path mapping has been SET map=restart. Can be used as a boolean or with a digit specified. |
robin: | Used to check whether a particular, clustered instance of WASD is available and distribute requests to it using a round-robin algorithm. See 5.3.4 Instance: and Robin: Keywords. |
script-name: | After the first pass of rule mapping (script component resolution), or during authorization processing, any script component of the request URI. |
server-addr: | The service IP address. CGI variable SERVER_ADDR. This may be wildcard string match or network mask expressed as address/mask-length. |
server_connect_gt: | An integer representing the current server network connections (those currently being processed plus those currently being "kept alive"). If greater than this value returns true, otherwise false. |
server_process_gt: | An integer representing the current server requests in-progress. If greater than this value returns true, otherwise false. |
server-name: | The (possibly virtual) server name. This may or may not exactly match any string provided via the host keyword. CGI variable SERVER_NAME. |
server-port: | The (possibly virtual) server port number. CGI variable SERVER_PORT. |
server-protocol: | "1.1", "1.0", "0.9" representing the HTTP protocol used by the request. |
server-software: | The server identification string, including the version. For example "HTTPd-WASD/8.0.0 OpenVMS/AXP SSL". CGI variable SERVER_SOFTWARE. |
service: | This is the composite server name plus port as server-name:port. To match an unknown service use "?". |
ssl: | Simple boolean value. If request is via Secure Sockets Layer then this will be true. |
syi_arch_name: | System information; CPU architecture of the server system, "Alpha", "Itanium" or "VAX". |
syi_hw_name: | System information; hardware identification string, for example "AlphaStation 400 4/233". |
syi_nodename: | System information; the node name, for example "KLAATU". |
syi_version: | System information; VMS version string, for example "V7.3". |
tcpip: | A string derived from the UCX$IPC_SHR shareable image. It looks something like this "Compaq TCPIP$IPC_SHR V5.1-15 (11-JAN-2001 02:28:33.95)" and comprises the agent (Compaq, MultiNet, TCPware, unknown), the name of the image, the version and finally the link date. |
time: | Compare to current system time. See 5.3.5 Time: Keyword. |
trnlnm: | Translate a logical name. See 5.3.6 Trnlnm: Keyword. |
upstream-addr: | Client proxy/accelerator IP address, when "SET CLIENT=keyword" has been applied to enable transparent up-stream proxy. Same as provided as CGI variable UPSTREAM_ADDR. As with all IP addresses used for conditional testing this may be wildcard string match or network mask expressed as address/mask-length (see 5.3.7 Host Addresses). |
user-agent: | Browser identification string as provided in "User-Agent:" request header field. CGI variable HTTP_USER_AGENT. |
webdav: | Simple boolean value. If the request has been identified as WebDAV then this is true. Takes an optional parameter, "MSagent", which is true if a Microsoft WebDAV agent has been detected. |
websocket: | Simple boolean value. If the request is a WebSocket protocol upgrade then this will be true. |
x-forwarded-for: | Proxied client name or address as provided in "X-Forwarded-For:" request header field. CGI variable HTTP_X_FORWARDED_FOR. |
The request notepad is a string storage area that can be used to store and retrieve ad hoc information during path mapping and subsequent authorization processing. The notepad contents can be changed using the SET notepad=<string> or appended to using SET notepad=+<string> (10.5.5 SET Rule). These contents then can be subsequently detected using the notepad: conditional keyword (or the obsolescent 'NO' mapping conditional) and used to control subsequent mapping or authorization processing.
Notepad information persists across internal redirection processing (10.5.2 REDIRECT Rule) and so may be used when the regenerated request is mapped and authorized. To prevent such information from unexpectedly interfering with internally redirected requests a notepad="" can be used to empty the storage area.
The dictionary facility provides similar and arguably superior functionality. See 5.5.4 WATCH Dictionary. In fact notepad is now implemented as a dictionary entry.
At the commencement of each pass a new pseudo-random number is generated (and therefore remains constant during that pass). The rand: conditional is intended to allow some sort of distribution to be built into a set of rules, where each pass (request) generates a different one. The random conditional accepts two parameters, a modulus number, which is applied to the base number as a modulus, and a comparison number, which is compared to the modulus result.
Hence the following conditional rules
Looks through each of the lines of the request header for the specified request field and/or value. This may be used to detect the presence of specific or unknown (to the server) request fields. When detecting just a specified field the name can be provided
Note that all request fields known to the server have a specific associated conditional keyword (i.e. "user-agent:" for the above example). To determine whether any request fields unknown to the server have been supplied use the request: keyword as in the following example.
Both of these conditionals are designed to allow the redistribution of requests between clustered WASD services. They are WASD-aware and so allow a slightly more tailored distribution than perhaps an IP package round-robin implementation might. Each tests for the current operation of WASD on a particular node (using the DLM) before allowing the selection of that node as a target. This can allow some systems to be shutting down or starting up, or have WASD shutdown for any reason, without requiring any extraordinary procedures to allow for the change in processing environment.
The instance: directive allows testing for a particular cluster member having a WASD instance currently running. This can allow requests to be redirected or reverse-proxied to a particular system with the knowledge that it should be processed (of course there is a small window of uncertainty as events such as system shutdown and startup occur asynchronously). The behaviour of the conditional block is entirely determinate based on which node names have a WASD instance and the order of evaluation. Compare this to a similar construct using the robin: directive, as described below.
This conditional is deployed in two phases. In the first, it contains a comma-separated list of node names (that are expected to have instances of WASD instantiated). In the second, it contains a single node name, allowing the selected node to be tested. For example.
If none of the node names specified in the first phase is currently running a WASD instance the rule returns false, otherwise true. If true the above example has the conditional block processed with each of the node names successively tested. If NODE1 has a WASD instance executing it returns true and the associated redirect is performed. The same for NODE2 and NODE3. At least one of these would be expected to test true otherwise the outer conditional established during phase one would have been expected to return false.
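The two-phase evaluation described above can be sketched in Python. This is an illustration of the logic only, not WASD code: the function names are hypothetical, and the set of running nodes stands in for the check WASD makes through the DLM.

```python
def instance_phase_one(candidates, running):
    """Phase one: true if any listed node currently runs a WASD instance."""
    return any(node in running for node in candidates)


def choose_instance(candidates, running):
    """Phase two: test each node in rule order, return the first one running.

    'running' is an assumed stand-in for the cluster-wide instance
    awareness the server obtains via the DLM.
    """
    if not instance_phase_one(candidates, running):
        return None                      # outer conditional is false
    for node in candidates:              # nodes are tested successively
        if node in running:
            return node                  # the associated rule is applied
    return None
```

Because the nodes are tested in a fixed order, the first running node in the list always wins, which is what makes the outcome determinate.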
The robin: conditional allows rules to be applied sequentially against specified members of a cluster that currently have instances of WASD running. This is obviously intended to allow a form of load sharing and/or redundancy (not balancing, as no evaluation of the selected target's current workload is performed, see below). As with the instance: directive above, there is, of course, a small window of potential uncertainty as events such as system shutdown and startup occur asynchronously and may impact availability between the phase one test and ultimate request distribution.
This conditional is again used in two phases. In the first, it contains a comma-separated list of node names (that are expected to have instances of WASD instantiated). In the second, it contains a single node name, allowing the selected node (from phase one) to have a rule applied. For example.
In this case round-robining will be made through four node names. Of course these do not have to represent all the systems in the cluster currently available or having WASD instantiated. The first time the 'robin:' rule containing multiple names is called VAX1 will be selected. The second time ALPHA1, the third ALPHA2, and the fourth IA64A. With the fifth call VAX1 is returned to, the sixth ALPHA1, etc. In addition, the selected nodename is verified to have an instance of WASD currently running (using the DLM and WASD's instance awareness). If it does not, round-robining is applied again until one is found (if none is available the phase one conditional returns false). This is most significant as it ensures that the selected node should be able to respond to a redirected or (reverse-)proxied request. This is the selection set-up phase.
Then there is the selection application phase. Inside the set-up conditional other conditionals apply the selection made in the first phase (through simple nodename string comparison). The rule, in the above example a redirect, is applied if that was the node selected.
During selection set-up unequal weighting can be applied to the round-robin algorithm by including particular node names more than once.
In the above example, the node ALPHA will be selected twice as often as either of VAX1 and VAX2 (and because of the ordering interleaved with the VAX selections).
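The selection set-up phase, including the skip over nodes without a running instance and the weighting by repetition, can be sketched as follows. This is a Python illustration under stated assumptions, not the server's implementation: the class name is hypothetical, the running set stands in for the DLM check, and the weighted list in the usage comment is an assumed ordering.

```python
class RoundRobin:
    """Cycle through a node list, skipping nodes with no running instance."""

    def __init__(self, names):
        self.names = list(names)     # a repeated name gets a larger share
        self.next_index = 0

    def select(self, running):
        """Return the next running node, or None if no listed node is running."""
        for _ in range(len(self.names)):
            node = self.names[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.names)
            if node in running:      # assumed stand-in for the DLM check
                return node
        return None                  # phase one conditional returns false


# Usage, with every node running:
#   rr = RoundRobin(["VAX1", "ALPHA1", "ALPHA2", "IA64A"])
#   [rr.select({"VAX1", "ALPHA1", "ALPHA2", "IA64A"}) for _ in range(5)]
#   gives ['VAX1', 'ALPHA1', 'ALPHA2', 'IA64A', 'VAX1']
# Weighting (assumed ordering): RoundRobin(["VAX1", "ALPHA", "VAX2", "ALPHA"])
```

The selection application phase is then a plain string comparison of each single node name against the value returned by select().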
The time: conditional allows server behaviour to change according to the time of day, week, or even year. It compares the supplied parameter to the current system time in one of three ways.
The trnlnm: conditional dynamically translates a logical name and uses the value. One mandatory and up to two optional parameters may be supplied.
The logical-name must be supplied; without it false is always returned. If just the logical-name is supplied the conditional returns true if the name exists or false if it does not. The default name-table is LNM$FILE_DEV. When the optional name-table is supplied the lookup is confined to that table. If the optional string-to-match is supplied it is matched against the value of the logical and the result returned.
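The decision rules for the three parameters can be modelled in a few lines of Python. This is a sketch under assumptions, not the server's code: logical name tables are represented as plain dictionaries, LNM$FILE_DEV is treated as a single table rather than a search list, and the match is assumed to be a case-insensitive wildcard comparison.

```python
import fnmatch


def trnlnm(tables, logical_name=None, name_table="LNM$FILE_DEV",
           string_to_match=None):
    """Model of the trnlnm: conditional result (hypothetical helper).

    tables: dict mapping table name -> dict of logical name -> value.
    """
    if not logical_name:
        return False                          # no logical name: always false
    value = tables.get(name_table, {}).get(logical_name)
    if value is None:
        return False                          # name does not exist in table
    if string_to_match is None:
        return True                           # existence test only
    # Assumption: wildcard, case-insensitive comparison of the value.
    return fnmatch.fnmatchcase(value.upper(), string_to_match.upper())
```

For example, with tables = {"LNM$FILE_DEV": {"WASD_MODE": "TEST"}} (an invented logical name), trnlnm(tables, "WASD_MODE") is true, while the same lookup confined to a table that does not hold the name is false.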
Host names or addresses can be an alpha-numeric string (if DNS lookup is enabled) or dotted-decimal network address, a slash, then a dotted-decimal mask. For example "131.185.250.0/255.255.255.192". This mask has 26 network bits, leaving a 6 bit host field. It operates by bitwise-ANDing the client host address with the mask, bitwise-ANDing the network address supplied with the mask, then comparing the two results for equality. Using the above example the host 131.185.250.50 would be accepted, but 131.185.250.250 would be rejected. Equivalent notation for this rule would be "131.185.250.0/26".
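The mask comparison can be checked with a short Python sketch. It follows the three steps given above (AND the client with the mask, AND the network with the mask, compare) for IPv4 only; the function name is hypothetical and this is not how the server itself is written.

```python
import ipaddress


def host_matches(client, rule):
    """Return True if an IPv4 client address falls inside 'network/mask'.

    The mask may be dotted-decimal (255.255.255.192) or a length (26).
    """
    network_text, mask_text = rule.split("/")
    if "." in mask_text:
        mask = int(ipaddress.IPv4Address(mask_text))
    else:
        length = int(mask_text)
        mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
    network = int(ipaddress.IPv4Address(network_text))
    host = int(ipaddress.IPv4Address(client))
    # AND both sides with the mask, then compare for equality.
    return (host & mask) == (network & mask)


# host_matches("131.185.250.10", "131.185.250.0/255.255.255.192")   # True
# host_matches("131.185.250.200", "131.185.250.0/26")               # False
```

Both notations produce the same 32-bit mask, so a rule written either way accepts the same set of hosts.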
The following provides a collection of examples of conditional mapping and authorization rules illustrating the use of wildcard matching, network mask matching and the various formats in which the rules may be blocked.
Of course there are a multitude of possibilities based on this idea!
The per-request dictionary stores key-value string pairs related to request processing. Some entries are generated and used internally by the server and others may be inserted, value changed, removed and tested by the server admin for conditional processing purposes.
The dictionary was initially introduced as an abstraction layer between the significantly different HTTP/2 and HTTP/1.n header semantics and server internal processing. Its utility was then extended into configuration. It is implemented as a standard hash table with collision lists. The small cost in terms of processing is completely offset by its effectiveness.
Dictionary entries may be configured using the SET dict=key=value mapping rule or the DICT key=value meta keyword. These are known as configuration entries. Keys must begin with an alpha-numeric character but otherwise keys and values may contain any printable character, with some needing to be escaped in the text of configuration files. These are some examples of each.
If an existing key is (re-)inserted it overwrites the old value.
An entry can have an empty value.
An entry may be removed from the dictionary by prefixing the key name with an exclamation point.
All configuration entries may be removed by using the exclamation point with an empty key.
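The insert, overwrite, empty-value and removal behaviour of configuration entries can be summarised in a small Python model. It is an illustration only: the function name is hypothetical, a plain dict stands in for the server's hash table, and the exact textual forms accepted in a configuration file are assumed here to be key=value, key=, !key and a bare !.

```python
def apply_dict_setting(entries, setting):
    """Apply one dictionary setting to a dict of configuration entries.

    Assumed forms: 'key=value', 'key=' (empty value), '!key' (remove one
    entry) and '!' (remove all configuration entries).
    """
    key, _, value = setting.partition("=")
    if key.startswith("!"):
        name = key[1:]
        if name == "":
            entries.clear()              # exclamation point with empty key
        else:
            entries.pop(name, None)      # remove a single entry if present
    else:
        entries[key] = value             # insert, or overwrite an old value
    return entries
```

Applying "colour=red", then "colour=blue", then "!colour" in turn leaves the model holding first red, then blue, then no entry for that key.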
As mentioned, the server generates and uses dictionary entries during request processing. There are multiple types of entry, generally insulated from each other for good reason. These entries are also available for conditional testing.
Character | Type | Description |
---|---|---|
~ | configuration | admin managed entry |
$ | internal | server processing |
> | request | request header field |
< | response | response header field |
The "if (dict:expression)" construct first checks for a configuration entry, then for a request header field entry, then finally for an internal entry (response entries are only available for testing after response processing begins and so not in the search list). It is also possible to test for a key of a specific type by prefixing the key name with the type character. This example shows a request header field being conditionally processed.
It is also possible to set an entry of a specific type by prefixing the key with the type character. For example the following will set a response header field that will be included in the header when returned to the client.
Setting any non-configuration entry should only be undertaken by the literati or the brave.
The value of a dictionary entry can be derived in whole or part from the value of another entry or entries. This uses a somewhat familiar substitution syntax. A contrived example shows an entry being set that transfers back the request user-agent header field as a response header field.
The content of a request's dictionary at significant stages of request processing can be viewed using the [x]Internal item of a WATCH report. See WATCH Facility of WASD Features and Facilities.
A request dictionary WATCH point is similar to the following (end of request processing) example. Note that all of the entry types described above are present in the example, including two configured entries. Note also that two of the internal entries contain embedded line-breaks and empty lines. This is an HTTP/2 request and the expanded (HTTP/1.n style) request_header and response_header entries are due to WATCH items Request [x]Header and Response [x]Header also being checked. They were not required for request processing.
The first three digit number is simply the entry count in order of insertion. The second, either square bracketed or period delimited, is the hash table entry. The square brackets indicate the head of the hash table, the periods down the collision list. The single punctuation character is used to indicate and differentiate the entry type. Then follow the key and equate-separated value. The brace enclosed numbers are the length of the key and value respectively.
6.1Functional Groupings |
6.2Alphabetic Listing |
The example configuration file can be used as a template.
By default, the logical name WASD_CONFIG_GLOBAL locates a global configuration file. Simple editing of the configuration file changes the rules. Alternatively the Server Administration page configuration interface may be used. Changes to the global configuration file require a server restart to put them into effect.
The [IncludeFile] is a directive common to all WASD configuration, allowing a separate file to be included as a part of the current configuration. See 2.1 Include File Directive.
Some directives take a single parameter, such as an integer, string or boolean value. Other directives can/must have multiple parameters. The version 4 configuration requires the directive to be placed on a line by itself and each separate parameter on a separate line following it. All parameter lines apply to the most recently encountered directive.
Note that all boolean directives are disabled (OFF) by default. This is done so that there can be no confusion about what is enabled and disabled by default. To use a directive-controlled facility it must be explicitly enabled.
Directives requiring periods (timeouts, lifetimes, etc.) can be specified as a single integer (representing seconds, minutes, hours, etc., depending on the directive) or unambiguously using any one of minutes:seconds, hours:minutes:seconds or days-hours:minutes:seconds.
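The accepted period formats can be illustrated with a small Python parser. This is a sketch, not the server's parser: the function name and the unit_seconds parameter are hypothetical, the latter standing in for the directive-dependent unit of a bare integer.

```python
def parse_period(text, unit_seconds=1):
    """Convert a period string to seconds.

    A bare integer is multiplied by unit_seconds (the unit depends on the
    directive). Otherwise one of minutes:seconds, hours:minutes:seconds
    or days-hours:minutes:seconds is expected.
    """
    text = text.strip()
    if text.isdigit():
        return int(text) * unit_seconds
    days = 0
    has_days = "-" in text
    if has_days:
        day_text, text = text.split("-", 1)
        days = int(day_text)
    fields = [int(field) for field in text.split(":")]
    if len(fields) == 3:
        hours, minutes, seconds = fields
    elif len(fields) == 2 and not has_days:
        hours = 0
        minutes, seconds = fields
    else:
        raise ValueError("unrecognised period: " + text)
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


# parse_period("5", unit_seconds=60)   # 300, for a directive counted in minutes
# parse_period("02:30")                # 150
# parse_period("1-00:00:00")           # 86400
```

The explicit forms are unambiguous because they never depend on the directive's default unit.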
Changes to the global configuration file can be validated at the command-line before restart. This detects and reports any syntactical and fatal configuration errors but of course cannot check the intent of the rules.
One or more (comma-separated if on the same line) internet host/domain names, with "*" wildcarding for host/subdomain matching, to be explicitly allowed access. If DNS lookup is not enabled hosts must be expressed using literal addresses (see [DNSLookup] directive). Also see the [Reject] directive. Reject directives have precedence over Accept directives. The Accept directive may be used multiple times.
Specifies the number of days to record activity statistics, available in report form from the Server Administration facility. Zero disables this data collection. The maximum is 28 days. 11520 bytes per day, and 80640 per week, is required to store the per-minute data.
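The storage figures can be reproduced with simple arithmetic, shown here as a Python sketch. The per-day figure comes from the text above; the 8 bytes per minute is derived by dividing it by the 1440 minutes in a day and is not stated by the source. The function name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60            # 1440 per-minute samples per day
BYTES_PER_DAY = 11520                # figure quoted in the text
BYTES_PER_MINUTE = BYTES_PER_DAY // MINUTES_PER_DAY   # derived: 8


def activity_bytes(days):
    """Bytes needed to hold per-minute activity data for 'days' days."""
    if not 0 <= days <= 28:          # zero disables; 28 is the maximum
        raise ValueError("days must be between 0 and 28")
    return days * BYTES_PER_DAY


# activity_bytes(7)    # 80640, the weekly figure quoted above
# activity_bytes(28)   # 322560, at the 28 day maximum
```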
Specifies a directory listing icon and alternative text for the mime content type specified in the template.
Specifies a directory listing icon for these non-content-type parts of the listing.
Add the content-types of a (de facto) standard MIME.TYPES file to the already configured [AddType] content-types. This binds a file suffix (extension, type) to a MIME content-type. Any specification in this file will supersede any previously defined via [AddType]. A MIME.TYPES file looks something like
The WASD server uses a number of extensions to provide additional information. See 2.7 Content-Type Configuration.
Binds a file suffix (extension, type) to a mime content type. The script name is used to auto-script against a specified file type. Use a hyphen as a place-holder and to indicate no auto-script. The description is used as documentation for directory listings.
The content-type string may include a specific character set. In this way non-default sets (which is usually ISO-8859-1) can be specified for any particular site or any particular file type. Enclose the content-type string with double-quotation marks.
To provide additional information for correct handling of FTP transfers the transfer mode can be indicated after the content type using the FTP: keyword. One of three characters is used. An "A" indicates that this file type should be FTP transferred in ASCII mode. An "I" or a "B" indicates that this file type should be FTP transferred in Image (binary) mode.
To specify a VMS record format for POST or PUT files use the RFM: keyword following the content-type. This record format will always be used when creating the file. The precedence for determining the created file record format is [AddType] RFM:, then any per-path PUT=RFM= mapping rule, then [PutBinaryRFM], then a default of UDF.
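The precedence order for the created file's record format can be written as a short Python function. This is a sketch only: the function and parameter names are hypothetical, and any record format strings other than UDF used with it are placeholders rather than a list of values the server accepts.

```python
def created_record_format(addtype_rfm=None, path_put_rfm=None,
                          put_binary_rfm=None):
    """Pick the record format for a created file by documented precedence.

    Order: [AddType] RFM:, then a per-path PUT=RFM= mapping rule,
    then [PutBinaryRFM], then the default of UDF.
    """
    for candidate in (addtype_rfm, path_put_rfm, put_binary_rfm):
        if candidate:                # first configured source wins
            return candidate
    return "UDF"
```

With nothing configured the function returns UDF; a value supplied for the content type overrides both of the other sources.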
Enables or disables BASIC username authentication.
Maximum concurrent authentication cache entries. This needs to be sized adequately to prevent the cache from thrashing (too many attempted entries causing each to spend very little time in the cache before being replaced, only to need to be inserted again with the next attempted access).
Maximum size of an authentication cache entry. The only reason where this may need to be increased is where a site is using the /PROFILE functionality and one or more accounts have a particularly large number of rights identifiers.
The number of minutes authentication information is cached before being revalidated from the authentication source. Zero disables caching (with a resultant impact on performance as each request requiring authentication is validated directly from the source).
Enables or disables Digest username authentication.
The number of seconds a digest nonce for a GET request (read) can be used before becoming stale.
The number of seconds a digest nonce for a PUT (/POST/DELETE ... write) request can be used before becoming stale.
The number of unsuccessful attempts at authentication before the username is disabled. Once disabled any subsequent attempt is automatically refused without further reference to the authentication source. A disabled username can be reenabled by simply purging the cache. Parallels the purpose of SYSGEN parameter LGI_BRK_LIM.
The period during which [AuthFailureLimit] is applied. Parallels the purpose of SYSGEN parameter LGI_BRK_TMO.
The period during which any intrusion aversion is applied. Parallels the purpose of SYSGEN parameter LGI_HID_TIM.
The number of minutes between authenticated requests that user authentication remains valid before the user is forced to reenter the authentication information (via browser dialog). Zero disables the requirement for revalidation.
If a SYSUAF authenticated password has expired (password lifetime has been reached) accept it anyway (in much the same way network logins are accepted in similar circumstances). This is very different to account expiry, after which authentication is always rejected.
When SYSUAF authentication is performed account access restrictions are checked. By default NETWORK restrictions are used but this global configuration parameter allows another to be specified.
If a SYSUAF authenticated password has expired the request is redirected to this URL to change the password.
Obsolete for WASD V9.3 and following.
The number of bytes allocated to script SYS$OUTPUT mailbox capacity. The [BufferSizeDclOutput] sets the maximum record size and [BufferQuotaDclOutput] the total number of bytes that can be outstanding at any given time.
The number of bytes allocated to store and process a script CGI response header.
The number of bytes (and hence BYTLM quota) permanently allocated to each scripting process CGIPLUSIN mailbox.
The number of bytes (and hence BYTLM quota) permanently allocated to each scripting process SYS$COMMAND mailbox.
The number of bytes (and hence BYTLM quota) permanently allocated to each scripting process SYS$OUTPUT mailbox.
The maximum bytes to be allocated to a buffer when transferring file content. For larger files this can improve both the reading of the file content from disk and when appropriately tuned to the local system the transmission of that content to the client, significantly increasing data rates. Limited to the $QIO maximum I/O unit of 65,535 bytes. Bigger is not always necessarily better (in the sense it always improves data rates).
This more esoteric directive attempts to minimise network buffer transmission wastage by rounding the output buffer size up to the network interface MTU (maximum transmission unit). This can provide small improvements to transmission efficiency. For example a filled buffer of 4096 with an MTU of 1500 sends two 1500 byte packets and then one of 1096 bytes, theoretically wasting some 404 bytes. A potentially better choice of buffer size would be 4500. Setting this directive to 1500 would result in the server automatically rounding a [BufferSizeNetWrite] value (for example) from 4096 up to 4500.
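The rounding arithmetic in this example can be reproduced with a short Python sketch. It mirrors the simplified model used in the text, where each packet carries a full MTU of data; the function names are hypothetical and the server's actual calculation may differ in detail.

```python
def round_up_to_mtu(buffer_size, mtu):
    """Round a buffer size up to the next whole multiple of the MTU."""
    if mtu <= 0:
        return buffer_size               # rounding not configured
    packets = -(-buffer_size // mtu)     # ceiling division
    return packets * mtu


def final_packet_shortfall(buffer_size, mtu):
    """Bytes by which the last packet of a full buffer falls short of the MTU."""
    remainder = buffer_size % mtu
    return 0 if remainder == 0 else mtu - remainder


# round_up_to_mtu(4096, 1500)          # 4500
# final_packet_shortfall(4096, 1500)   # 404
# final_packet_shortfall(4500, 1500)   # 0
```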
The number of bytes allocated to the network read buffer (used for request header, POST body, etc.). Also the number of bytes (and hence BYTLM quota) permanently allocated to each scripting process SYS$INPUT mailbox (allowing a script to read a request body).
Number of bytes allocated to the network write buffer. This buffer is used as the basic unit when transferring file contents (from cache or the file system), as an output buffer during SSI processing, directory listing, etc. During many activities multiple outputs are buffered into this storage before being written to the network.
File cache control.
Granularity of memory blocks allocated to file data, in kilobytes.
Maximum number of files loaded into the cache before entries are reused removing the original contents from the cache.
Maximum size of a file before it is not a candidate for being cached, in kilobytes.
Minimum, total number of hits an entry must sustain before being a candidate for [CacheFrequentPeriod] assessment.
If a file has been hit at least [CacheFrequentHits] times in total and the last was within the period here specified it will not be a candidate for reuse. See 9. Cache Configuration.
During this period subsequent reload (no-cache) requests will not result in the entry being revalidated or reloaded. This guard period can help prevent unnecessary file system activity.
Obsolete for WASD V8.0 and following.
Maximum memory allocated to the cache, in kilobytes.
The interval after which a cache entry's original, content revision time is revalidated against the file's current revision time. If not the same the contents are declared invalid and reloaded.
Document and CGI script output can be dynamically converted from one character set to another using the standard VMS NCS conversion library. This directive provides the server with character set aliases (those that are for all requirements the same) and which NCS conversion function may be used to convert one character set into another. The general format is
When this directive is configured the server compares each text response's character set (if any) to each of the directive's document charset strings. If one matches, the server then compares each of the accepted charsets (if multiple) to the request "Accept-Charset:" list of accepted character sets. If one is the same the document is either accepted as-is or, if a conversion function is specified, converted by NCS as it is transferred.
The default character set sent in the response header for text documents (plain and HTML). English language sites should specify ISO-8859-1, other Latin alphabet sites, ISO-8859-2, 3, etc. Cyrillic sites might wish to specify ISO-8859-5 or KOI8-R, and so on.
A script must output a full HTTP or CGI-compliant response. If a plain-text stream is output an error is reported (being the more common behaviour for servers). Errors in output can be diagnosed using the WATCH facility.
The maximum number of concurrent client connections before a "server too busy right now ... try again shortly" error is returned to the client.
Period a script is allowed to continue processing before being terminated after a client prematurely disconnects. An appropriate setting allows most scripts to conclude elegantly and be available for further use. This improves scripting efficiency significantly. Setting this period to zero terminates scripts (and their associated processes) immediately a client is detected as having disconnected.
Whenever the last scripting process is removed from the system, or this number of minutes maximum (whichever occurs first), scan the WASD_SCRATCH directory (if logical defined and it exists) deleting all files that are older than [DclCleanupScratchMinutesOld] minutes. Setting to zero disables WASD_SCRATCH scans.
When performing a [DclCleanupScratchMinutesMax] scan delete files that are older than this value (or the value specified by [DclCleanupScratchMinutesMax], whichever is the larger).
If non-zero the CGIplus process is terminated the specified period after it last processed a request (idle for that period). Adjusting the period to suit the site allows frequently used persistent scripts and scripting engines to remain resident while more sporadically accessed ones do not remain unnecessarily. If this value is zero (or unconfigured) the idle timeout is one hour.
By default scripts are executed within server processes. When enabled this instructs the server to create detached processes. This side-steps the issues of having pooled process quotas and also allows non-server-account scripting and in particular "Scripting Overview, Introduction".
When detached scripting processes are created it is possible to assign them base priorities lower than the server itself. This directive takes one or two (comma-separated) integers that determine how many priorities lower than the server scripting processes are created. The first integer determines server processes. A second, if supplied, determines user scripts. User scripts may never be a higher priority than server scripts.
When enabled, non-SSL, process script CGI environments have a CGI variable WWW_GATEWAY_BG created containing the device name (BGnnnn:) of the TCP/IP socket connected to the client. This socket may be accessed by the script for transmission of data directly to the client, bypassing the server entirely. This is obviously much more efficient for certain classes of script. For purposes of accurate logging the server does need to be informed of the quantity of data transferred using a CGI callout. See "Scripting Environment" document.
The maximum number of DCL/CGI script processing processes that may ever exist concurrently (works in conjunction with [DclSoftLimit]).
Script proctoring proactively creates and maintains specific persistent scripts and scripting environments (RTEs). It is intended for those environments that have some significant startup latency. See WASD Web Services - Scripting for further information.
One or more file type (extension) specification and scripting verb pairs. See "Scripting Overview, Runtime".
The number of DCL/CGI script processing processes after which idle processes are deleted to make room for new ones. The [DclHardLimit] should be approximately 25% more than the [DclSoftLimit]. The margin exists to allow for occasional slow run-down of deleted/finishing processes. If these limits are not set (i.e. zero) they are calculated with [ProcessMax] using "[DclSoftLimit] = [ProcessMax]" and "[DclHardLimit] = [DclSoftLimit] + [DclSoftLimit] / 4".
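The default calculation quoted above can be expressed as a Python sketch. Two points are assumptions rather than statements from the source: that each limit is defaulted independently when it is zero, and that the division by four is an integer division. The function name is hypothetical.

```python
def dcl_limits(process_max, soft_limit=0, hard_limit=0):
    """Return (soft, hard) script process limits, filling in zero values.

    Defaults follow the quoted formulas:
      [DclSoftLimit] = [ProcessMax]
      [DclHardLimit] = [DclSoftLimit] + [DclSoftLimit] / 4
    """
    if soft_limit == 0:
        soft_limit = process_max
    if hard_limit == 0:
        hard_limit = soft_limit + soft_limit // 4    # about 25% headroom
    return soft_limit, hard_limit


# dcl_limits(100)                  # (100, 125)
# dcl_limits(100, soft_limit=40)   # (40, 50)
```

The quarter added to the hard limit is the margin for processes that are slow to run down.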
By default, when a DCL/scripting subprocess is spawned it inherits the server's currently enabled privileges, which are none, not even TMPMBX or NETMBX. If this parameter is enabled the subprocess is created with the server account's SYSUAF-authorized privileges (which should never be other than NETMBX and TMPMBX). Use with caution.
If this value is zero the use of persistent DCL processes is disabled. If non-zero the zombie process is terminated the specified period after it last processed a request. This helps prevent zombie processes from clogging up a system. See "Scripting Environment" document.
Period a DECnet scripting connection is maintained with the network task. Zero disables connection reuse.
The size of the list used to manage connections for DECnet scripting. Zero effectively allows the server to use as many DECnet scripting connections as demanded.
Controls directory listings. SELECTIVE allows access only to those directories containing a file WWW_BROWSABLE. The WASD HTTPd directory access facility always ignores directories containing a file named WWW_HIDDEN. Also see the [DirWildcard] directive.
Specifies the HTML <BODY> tag for directory listing pages. This allows some measure of site "look-and-feel" in page colour, background, etc. to be employed.
Non-zero enables HTML file descriptions during listings. Generating HTML descriptions involves opening each HTML file and searching for <TITLE>...</TITLE> and <H1>...</H1> text to generate the description. This is an obviously resource-intensive activity and on busy servers or systems may be disabled. Any non-zero number specifies the number of lines to be searched before quitting. Set to a very high number to search all of a file's contents (e.g. 999999).
Allows specification of the directory listing layout. This is a short, case-insensitive string that specifies the included fields, relative placement and optionally the width of the fields in a directory listing. Each field is controlled by a single letter and optional leading decimal number specifying its width. If a width is not specified an appropriate default applies. An underscore is used to indicate a single space and is used to separate the fields (two consecutive works well).
The following shows some examples:
The size of files is displayed by default as 1024 byte kilos. When using the "S:k", "S:m" and "S:f" size modifiers the size is displayed as 1000 byte kilos. If it is preferred to have the default display in 1000 byte kilos then set the directory listing layout using:
If unsure of the kilo value being used check the "<META>" information in the directory listing.
Includes, as <META> information, the software ID of the server and any relevant VMS file information.
When a directory is accessed having no file or type component and there is no welcome page available a directory listing is generated. By default any other directory accessed from this listing has the implied wildcards "*.*" added, consequently forcing directory listings. If enabled, this directive ensures no wildcards are added, so subsequent directories accessed with welcome pages display the pages, not a forced listing.
To prevent browsing through directories (perhaps due to inadvertent mapping) that have file permissions allowing no WORLD access the server stops listing and reports the error the first time a protection violation occurs. This behaviour may be changed to ignore the violation, listing only those files to which it has access.
Allows specification and display of the RMS file owner information.
Directory listings and trees may be pre-expired. That is, the listing is reloaded each time the page is referenced. This is convenient in some environments where directory contents change frequently, but adds considerable over-head and so is disabled by default. Individual directory listings may have the default behaviour over-ridden using syntax similar to the following examples:
If any of the files provided using the [DirReadMeFile] directive are located in the directory the contents are included at the top or bottom of the listing (or not at all). Plain-text are included as plain-text, HTML are included as HTML allowing markup tags to be employed.
Specifies the names and order in which a directory is checked for read-me files. This can be enabled or disabled using the [DirReadme] directive. Plain-text are included as plain-text, HTML are included as HTML allowing markup tags to be employed.
Examples:
This enables the facility to force the server to provide a directory listing by providing a wildcard file specification, even if there is a home (welcome) document in the directory. This should not be confused with the [DirAccess] directive which controls directory listing itself.
Enables or disables connection request host name resolution. This functionality may be expensive (in terms of processing overhead) and make serving granularity coarser if DNS is involved. If not enabled and logging is, the entry is logged against the literal internet address. If not enabled any [Accept], [Reject] or conditional directive, etc., must be expressed as a literal address.
The period for which a host name/address is cached (applies to both client lookup and proxy host lookup).
The number of attempts, at two second intervals, made to resolve a host name/address (applies to both client lookup and proxy host lookup).
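The retry behaviour described above can be sketched as a loop that makes a fixed number of attempts and pauses two seconds between them. This is an illustration only and not the server's code; the function name and the injectable `resolver` and `sleep` parameters are assumptions made so the sketch can be exercised without a network.

```python
import time


def resolve_with_retry(resolver, name, attempts, interval_seconds=2.0,
                       sleep=time.sleep):
    """Call resolver(name) up to `attempts` times, pausing between failures.

    resolver: any callable raising OSError on failure (socket.gaierror is
              a subclass of OSError, so socket.gethostbyname fits)
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return resolver(name)
        except OSError as error:
            last_error = error
            # Pause only when another attempt will follow.
            if attempt + 1 < attempts:
                sleep(interval_seconds)
    raise last_error
```

In real use `resolver` could be `socket.gethostbyname`; the cached result would then be kept for the period set by the preceding directive.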
An entity tag is a client-opaque string used in strong cache validation. WASD generates this using the on-disk file identification (FID) and binary last-modified date-time (RDT). This is then used as a definitive identifier for a specified on-disk resource fixed in file-system space-time (hmmm, sounds like an episode of Star Trek).
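As a rough illustration, a validator of this kind can be sketched by combining a file identification and a last-modified timestamp into one opaque string. The layout used here (three 16-bit FID words and a 64-bit RDT rendered as hexadecimal) and both function names are assumptions for the example, not WASD's actual entity tag format.

```python
def make_entity_tag(fid, rdt):
    """Build a quoted, opaque entity tag from a file ID and revision time.

    fid: tuple of three 16-bit integers (hypothetical FID words)
    rdt: 64-bit integer binary date-time (hypothetical RDT value)
    """
    if len(fid) != 3 or any(not 0 <= word <= 0xFFFF for word in fid):
        raise ValueError("fid must be three 16-bit words")
    if not 0 <= rdt <= 0xFFFFFFFFFFFFFFFF:
        raise ValueError("rdt must fit in 64 bits")
    fid_hex = "".join(f"{word:04x}" for word in fid)
    return f'"{fid_hex}{rdt:016x}"'


def etag_matches(if_none_match, current_tag):
    """Return True when an If-None-Match value names the current tag."""
    candidates = [item.strip() for item in if_none_match.split(",")]
    return "*" in candidates or current_tag in candidates
```

Because both inputs change whenever the file is replaced or modified, two requests see the same tag only when they refer to the same on-disk content.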
Specifies the URL-format path to an optional, error reporting SSI document or script. See 2.10 Error Reporting. This path can subsequently be remapped during request processing. Optional, space-separated HTTP status codes restrict the path to those codes, with the remainder handled by server-internal reporting.
Provides a short message recommending action when reporting an error to a client. For example, if a document cannot be found it may say:
Enables GZIP encoding of request bodies. See 2.4 GZIP Encoding.
Adjusts the maximum period between GZIP buffer flushes. See 2.4 GZIP Encoding.
Enables GZIP encoding (deflation) for suitable requests and responses. Valid values are 1 for minimum compression (and minimum resource usage) through to 9 for maximum compression (and maximum resource usage). The value 9 is recommended. See 2.4 GZIP Encoding.
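The trade-off between the two ends of the 1 to 9 scale can be tried outside the server. The sketch below uses Python's standard `gzip` module, which exposes the same zlib-style level scale; it is not WASD code, and the sample payload is invented for the illustration.

```python
import gzip


def compare_levels(payload):
    """Return compressed sizes for the minimum and maximum levels."""
    fastest = gzip.compress(payload, compresslevel=1)
    smallest = gzip.compress(payload, compresslevel=9)
    # Both levels must decode back to exactly the same bytes.
    assert gzip.decompress(fastest) == payload
    assert gzip.decompress(smallest) == payload
    return {"level_1": len(fastest), "level_9": len(smallest)}


# Repetitive markup, loosely resembling a generated listing.
sample = b"<li><a href='file.txt'>file.txt</a></li>\n" * 500
sizes = compare_levels(sample)
```

Comparing the two sizes for representative site content gives a feel for how much extra compression the higher level buys for the extra processing.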
Enable or disable (default) HTTP/2 for all services. The default for a service follows the global setting. A service must explicitly disable HTTP/2 if that is required.
The maximum permitted size (in octets) of an HTTP/2 frame sent from the client.
The maximum permitted size (in bytes) of a request header sent from the client.
The maximum permitted size (in bytes) of a request header compression table.
The period at which HTTP/2 pings are sent from the server to the client to calculate the (then) Round Trip Time (RTT) of the connection.
Maximum number of concurrent streams (requests) supported by the connection.
Initial flow-control window size (in bytes).
Number of per-node server processes to create and maintain. If set to "CPU" one instance per CPU is created.
Start a multiple instance server already in passive mode.
Enables or disables the request log. Logging can slow down request processing and adds overhead. The log file name must be specified using the /LOG qualifier or WASD_CONFIG_LOG logical name (‘LOGICAL NAMES’ in 10.2 VMS File System Specifications).
One or more (comma-separated if on the same line) internet host/domain names, with "*" wildcarding for host/subdomain matching, requests from which are not placed in any log files. If DNS lookup is not enabled hosts must be expressed using literal addresses (see [DNSLookup] directive). Use for excluding local or web-maintainer's host from logs.
Example:
Number of blocks allocated when a log file is opened or extended. If set to zero it uses the process default (SET RMS_DEFAULT /EXTEND_QUANTITY).
Provides some or all of the access log file name. See 2.12.2 Log Per-Period.
Specifies one of three pre-defined formats, or a user-definable format. See 2.12.1 Log Format.
When [LogPeriod] or [LogPerService] directives are used to generate multiple log files this directive may be used to modify the naming of the file. See 2.12.5 Log Naming.
Specifies a period at which the log file is changed. See 2.12.2 Log Per-Period.
When multiple instances are configured (see Instances and Environments of WASD Features and Facilities) create a separate log for each. This has significant performance advantages. See 2.12.4 Log Per-Instance.
When multiple services are specified a separate log file will be created for each if this is enabled. See 2.12.3 Log Per-Service.
When generating a log name do not make the port number part of it. This effectively provides a single log file for all ports provided against a host name (e.g. a standard HTTP service on port 80 and an SSL service on port 443 would have entries in the one file). See 2.12.3 Log Per-Service.
After an access log record fails to write all subsequent requests return a 503 service unavailable response until records can be successfully written again. This can be used to prevent access to server resources unless an access audit log is available.
Allows monitoring via the HTTPDMON utility. Adds slight request processing overhead.
Report to operator log and any enabled operator console (see [OpcomTarget]) server administration directives originating from the Server Administration Menu, for example path map reload, server restart, etc.
Report events related to authentication/authorization. For example username-password validation failures.
Report HTTPD/DO=directive control events, both the command-line directive and the server's response.
Report events concerning the server itself. For example, server startup and exit (either normally or with error status).
Report events related to proxy server cache maintenance. For example, the commencement of file cache reactive and proactive purging, the conclusion of this purge, both with cache device statistics.
This enables OPCOM messaging and specifies the target for the OPCOM reports. This must be set to a target to enable OPCOM messages, irrespective of the setting of any of the other [Opcom...] directives. These messages are added to SYS$MANAGER:OPERATOR.LOG and displayed at the specified operator's console if enabled (using REPLY/ENABLE=target). The operator log provides a "permanent" record of server events. Possible settings include CENTRAL, NETWORK, SECURITY, OPER1 … OPER12, etc.
Pipelining refers to multiple requests being sent over an assumed persistent connection without waiting for the response from previous requests. Such behaviour with capable clients and servers can significantly reduce response latency.
IP port number for server to bind to. For anything other than a command-line server control this parameter is overridden by anything supplied via the [Service] (deprecated) directive.
The maximum number of concurrent client requests being processed before a "server too busy right now ... try again shortly" error is returned to the client. If not explicitly set this defaults to the same value as [ConnectMax]. This directive allows a larger number of persistent connections to be maintained than are concurrently being processed at any given moment.
Enables or disables proxy caching on a whole-of-server basis, irrespective of any proxy services that might be configured for caching.
Maximum size of a cache file in kilobytes before it will not be cached.
Negative (unsuccessful) responses are cached for this period.
Hour of day for routine cache purge (00-23).
Interval in minutes between checking space availability on cache device. If space is not available a reactive purge is initiated.
Organization of directories on the proxy cache device. The first provides a single level structure with a possible 256 directories at the top level and files organized immediately below these. For versions of VMS prior to V7.2 exceeding 256 files per directory, or a total of approximately 65,000 files, incurs a significant performance penalty for some directory operations. The second organization involves two levels of directory, each with a maximum of 64 directories. This allows for approximately 1,000,000 files before encountering the 256 files per directory issue.
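The capacity figures quoted above follow from simple multiplication, and the placement of a file within either layout can be pictured as taking bytes of a hash of the URL. In the sketch below the organization labels `flat256` and `64x64`, the use of SHA-256, and the directory naming are all assumptions made for illustration; the server's own hashing and naming may differ.

```python
import hashlib

FILES_PER_DIRECTORY = 256  # threshold quoted above for VMS prior to V7.2


def capacity(organization):
    """Approximate file count before any directory exceeds the threshold."""
    if organization == "flat256":
        return 256 * FILES_PER_DIRECTORY
    if organization == "64x64":
        return 64 * 64 * FILES_PER_DIRECTORY
    raise ValueError(f"unknown organization: {organization}")


def cache_directory(url, organization):
    """Map a URL to a directory path using bytes of a digest of the URL."""
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    if organization == "flat256":
        # One level: 256 possible top-level directories.
        return f"{digest[0]:02X}"
    if organization == "64x64":
        # Two levels: 64 possible directories at each level.
        return f"{digest[0] % 64:02X}/{digest[1] % 64:02X}"
    raise ValueError(f"unknown organization: {organization}")
```

Here `capacity("flat256")` is 256 × 256 = 65,536 and `capacity("64x64")` is 64 × 64 × 256 = 1,048,576, matching the approximate totals given in the text.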
The maximum percentage in use on the cache device before a reactive purge is scheduled. If device usage exceeds this limit no more cache files are created.
The percentage by which the cache device usage is attempted to be reduced when a reactive purge is initiated.
Prevents pragma reloads actually retrieving the file from the source host again until the period expires. This is designed to limit concurrent or repeated reloads of files into the cache unnecessarily. Thirty seconds is probably an adequate period balancing effect against a user legitimately needing to recache the document.
A list of comma-separated integers representing the sequence of last accessed period in hours used during a progressive reactive purge.
A list of comma-separated integers representing the sequence of age in hours used when determining whether a cache file's contents should be reloaded.
The maximum number of established connections that are maintained to remote servers.
Period for which the established connections persist. At expiry the connection is closed.
Period for which the proxy server will attempt to establish a network connection to the origin (remote) server.
BY enables the addition of a proxy request header line providing information that the request has been forwarded by another agent. The added header line would look like "Forwarded: by http://server.name.domain (HTTPd-WASD/n.n.n OpenVMS/AXP Digital-TCPIP SSL)". If the FOR variant is used the field includes the host name (or ADDRESS) the request is being forwarded on behalf of, as in "Forwarded: by http://server.name.domain (HTTPd-WASD/n.n.n OpenVMS/AXP Digital-TCPIP SSL) for host.name.domain".
When the server is resolving the name of a remote host the request may timeout due to up-stream DNS server latencies. This parameter allows a number of retries, at five second intervals, to be enabled.
Enables or disables the server process log reporting significant proxy processing events, such as cache maintenance activity.
Enables or disables the server process log reporting of proxy caching activity.
Enables or disables proxy serving on a whole-of-server basis, irrespective of any proxy services that might be configured.
When enabled propagates all request fields provided by the client through to the proxied server. When disabled only propagates fields that WASD recognises.
Obscure functionality; see WASD Proxy Service feature.
Enables the addition of a proxy request header line providing the host name on behalf of which the request is being proxied. The added header line would look like "X-Forwarded-For: host.name.domain". The ADDRESS variant provides the IP address, and the UNKNOWN variant substitutes "unknown" for the host. This field is designed to be compatible with the Squid de facto standard field of the same name. Any request with an existing "X-Forwarded-For:" field has the local information appended to the existing as a comma-separated list. The first host in the field should be the original requesting client.
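The appending behaviour can be sketched as follows. This is an illustration of the rule as described, not the server's code; the function name, its parameters and the `ENABLED` label for the default variant are assumptions.

```python
def forwarded_for(existing, client_host, client_address, variant="ENABLED"):
    """Return the X-Forwarded-For value to send to the origin server.

    existing: value of any X-Forwarded-For field already on the request,
              or None/"" when the client supplied none
    variant:  "ADDRESS" and "UNKNOWN" follow the text above; any other
              value (here the assumed label "ENABLED") uses the host name
    """
    if variant == "ADDRESS":
        local = client_address
    elif variant == "UNKNOWN":
        local = "unknown"
    else:
        local = client_host
    if existing:
        # Keep earlier hops first so the original client stays at the front.
        return f"{existing}, {local}"
    return local
```

For a request arriving through an earlier proxy, `forwarded_for("client.example", "proxy1.example", "192.0.2.7")` yields `client.example, proxy1.example`, leaving the original client first in the list.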
Record format for a non-text HTTP POST or PUT upload into the file-system. Has a per-path equivalent. The precedence for determining the created file record format is [AddType] RFM:, then any per-path PUT=RFM= mapping rule, then [PutBinaryRFM], then the default of UDF.
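The precedence chain above amounts to taking the first setting that is present and falling back to UDF. A minimal sketch, assuming each source is represented as a string or `None` when not configured (the function and parameter names are hypothetical):

```python
def created_record_format(add_type_rfm=None, path_put_rfm=None,
                          put_binary_rfm=None):
    """Pick the record format for an uploaded non-text file.

    Order of precedence, highest first:
      1. [AddType] RFM: setting for the content type
      2. per-path PUT=RFM= mapping rule
      3. global [PutBinaryRFM] directive
      4. default of UDF
    """
    for candidate in (add_type_rfm, path_put_rfm, put_binary_rfm):
        if candidate:
            return candidate
    return "UDF"
```

The format names passed in are treated as opaque strings here; which values the server accepts is not covered by this sketch.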
Maximum size of an HTTP POST or PUT method request in Kilobytes. Has a per-path equivalent.
File created using the POST or PUT methods have the specified version limit applied.
Enable regular expression matching. With the possibility of the reserved character "^" being used in existing mapping rules regular expression string matching (4. String Matching) is only available after enabling this directive.
The default syntax is POSIX EGREP but can be specified by substituting for ENABLED one of the following keywords; AWK, ED, EGREP, GREP, POSIX_AWK, POSIX_BASIC, POSIX_EGREP, POSIX_EXTENDED, POSIX_MINIMAL_BASIC, POSIX_MINIMAL_EXTENDED, SED. When changed from the default enabled (WASD) case-insensitivity is lost.
One or more (comma-separated if on the same line) internet host/domain names, with "*" wildcarding for host/subdomain matching, to be explicitly denied access. If DNS lookup is not enabled hosts must be expressed using literal addresses (see [DNSLookup] directive). Also see the [Accept] directive. Reject directives have precedence over Accept directives. The Reject directive may be used multiple times.
Example:
Only ever supply basic information in a report (2.10 Error Reporting).
Includes in detailed reports, as <META> information, the software ID of the server and any relevant VMS file information.
The server can keep a list of the most recent requests accessible from the Server Administration page. This value determines the number kept. Zero disables the facility. Each retained request consumes 256 bytes and adds a small amount of extra processing overhead.
Enables and disables all scripting mechanisms. This includes CGI and CGIplus, DECnet-based OSU and CGI, and SSI directives that use DCL processes, such as <!--#dcl --> and <!--#exec -->.
Specifies the URL-format path to the default query-string keyword search script. This path can subsequently be remapped during request processing.
Example:
Provides a list of file types that are excluded from an implied keyword search. This is useful for client-side (browser-side) active processing that may require a query string to pass information. This query string would normally be detected by the server and, if not in a format meaningful to the server itself, is then considered an implied (HTML <ISINDEX>) keyword search, with the appropriate script being activated.
Example:
Enable Secure Sockets Layer (SSL) / Transport Layer Security (TLS) if the server has been built with that option. See Transport Layer Security of WASD Features and Facilities.
Specifies the contact email address for server administration issues. Included as a "mailto:" link in the server signature if [ServerSignature] is set to email.
Specifies the HTML <BODY> tag for server administration and administration report pages. This allows some measure of control over the "look-and-feel" of page and link colour, etc., for the administrator.
Specifies the HTML <BODY> tag for server error and other report pages. This allows some measure of site "look-and-feel" in page colour, background, etc. to be maintained.
The server signature is a short identifying string added to server generated error and other report pages. It includes the server software name and version, along with the host name and port of the service. Setting this to email makes the host name a mailto: link containing the address specified by the [ServerAdmin] directive.
This parameter allows SSL, multi-homed hosts and multiple port serving to be specified.
Provides a default path for reporting a virtual host does not exist, see 2.3.2 Unknown Virtual Server.
Number of bytes allocated at the device-driver level for a network connection receive buffer. See VMS Server Account in WASD Install.
Number of bytes allocated at the device-driver level for a network connection send buffer. Later versions of TCP/IP Services seem to have large default values for this. MultiNet and TCPware are reported to improve transfers of large responses by increasing low default values. See VMS Server Account in WASD Install.
Enables or disables Server Side Includes (HTML pre-processing).
Enables or disables Server Side Includes (HTML pre-processing) file access counter.
Enables or disables Server Side Includes (HTML pre-processing) DCL execution functionality.
SSI source files are completely read into memory before processing. This allows the maximum size to be expanded beyond the default.
TLS/SSL server certificate file path.
A colon-separated list (OpenSSL syntax) of TLS/SSL ciphers that clients are allowed to use when connecting to SSL services. This parameter can be used to force the selection of stronger ciphers, with the connection refused if none can be agreed.
TLS/SSL multiple WASD instance, shared session cache. Maximum number of shared records.
TLS/SSL multiple WASD instance, shared session cache. Size in bytes of each individual record.
TLS/SSL server certificate private key file path. The private key is commonly embedded into the certificate file.
Alphanumeric flags supported by WASD or hexadecimal value applied to the SSL option of OpenSSL.
Single WASD instance, shared session cache. Maximum number of records. Records are dynamically sized.
The default maximum period for session reuse is five minutes. This may be set globally using this directive or on a per-service basis using the per-service equivalent [ServiceSSLsessionLifetime].
When non-zero represents the number of seconds, or maximum age, of a HSTS "Strict-Transport-Security:" response header field. See Transport Layer Security of WASD Features and Facilities. There is an equivalent per-service directive.
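The relationship between the configured number of seconds and the response field can be sketched as below. The `max-age` form is the standard HSTS syntax from RFC 6797; the function name is hypothetical, and whether the server adds further HSTS parameters is not covered here.

```python
def hsts_header(max_age_seconds):
    """Return the HSTS response header line, or None when disabled (zero)."""
    if max_age_seconds < 0:
        raise ValueError("max-age cannot be negative")
    if max_age_seconds == 0:
        # Zero means the field is not generated at all.
        return None
    return f"Strict-Transport-Security: max-age={max_age_seconds}"
```

For example, a setting of 31536000 (one year of 365 days) would produce `Strict-Transport-Security: max-age=31536000`.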
To access this service a client must provide a verified CA client certificate.
Specifies the location of the collection of Certificate Authority (CA) certificates used to verify a peer certificate (VMS file specification).
When a client certificate is requested for authentication via TLS/SSL renegotiation this is the maximum kilobytes POST/PROPFIND/PUT data buffered during the renegotiation. There is an equivalent per-service directive.
Level through a certificate chain a client is verified to.
The abbreviation for the TLS/SSL protocol version allowed to be used to connect to an SSL service. Using the directive a service may select preferred protocols.
Enables or disables automatic conversion of VARIABLE record format documents (files) to STREAM-LF, which are much more efficient with this server. The integer is the maximum size of a file in kilobytes that the server will attempt to convert. Zero disables any conversions.
(Retired in v5.3, mapping SET rule provides this now, see 10.5.5 SET Rule).
The maximum period of time before an idle HTTP/2 connection is issued with a GOAWAY frame. An idle HTTP/2 connection is one where it has not processed a request.
Period allowing a connection request to be in progress without submitting a complete request header before terminating it.
The period a persistent connection with the client is maintained after the conclusion of a request. Connection persistence improves the overall performance of the server by reducing the number of discrete TCP/IP connections that need to be established.
Period allowing request output to continue without any increase in the number of bytes transferred. This directive is targeted at identifying and eliminating requests that have stalled.
Period allowing a request to be output before terminating it. This directive sets an absolute maximum time a request can continue to receive output.
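The difference between the two output timeouts is that one measures time since the last progress and the other measures time since output began. A minimal sketch of that distinction, with hypothetical names and with all times expressed in seconds (the exact boundary comparison the server uses is an assumption):

```python
def output_timed_out(now, started_at, last_progress_at,
                     no_progress_limit, output_limit):
    """Classify a request's output state against the two timeout periods.

    Returns "no-progress" when no bytes have moved for no_progress_limit
    seconds, "output" when the absolute output period has been exceeded,
    or None when the request may continue.
    """
    if now - last_progress_at >= no_progress_limit:
        return "no-progress"
    if now - started_at >= output_limit:
        return "output"
    return None
```

A slow but steadily progressing transfer therefore survives the no-progress check yet is still ended once the absolute period is reached.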
Enable WebDAV on a server-wide basis (see WebDAV of WASD Features and Facilities).
Enable WebDAV locking.
Ancestor directory locking depth.
Set default locking period.
Maximum locking period.
Location of metadata files.
Enable disk quota reporting.
Specifies the names and order in which a directory is checked for home page files. If no home page is found a directory listing is generated.
Dynamic home pages (script or interpreter engine driven, e.g. Perl, PHP) may be deployed using a combination of the [Welcome] and [DclScriptRunTime] directives.
When enabled considers www.host.name and host.name to be the same virtual service. If a request being processed has a virtual host of www.host.name and the service matching, rule matching or authentication matching process encounters a host.name virtual service it is considered a match. A request with a virtual host of host.name does not match a service of www.host.name.
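The matching is deliberately one-directional, which the following sketch makes explicit. It is an illustration of the rule as stated, not the server's code; the function and parameter names are hypothetical.

```python
def virtual_host_matches(request_host, service_host, www_implied):
    """Decide whether a request's virtual host matches a configured service.

    With www_implied enabled, a request for www.host.name matches a
    host.name service, but a request for host.name does NOT match a
    www.host.name service.
    """
    request_host = request_host.lower()
    service_host = service_host.lower()
    if request_host == service_host:
        return True
    if www_implied and request_host == "www." + service_host:
        return True
    return False
```

With the directive enabled, `virtual_host_matches("www.host.name", "host.name", True)` is true while `virtual_host_matches("host.name", "www.host.name", True)` remains false.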
7.1Specific Services |
7.2Generic Services |
7.3SSL Services |
7.4Administration Services |
7.5IPv4 and IPv6 |
7.6To www. Or Not To www. |
7.7Service Directives |
7.8Directive Detail |
7.9Administration |
7.10Examples |
By default, the logical name WASD_CONFIG_SERVICE locates a common service configuration file. The service configuration file is optional. If the WASD_CONFIG_SERVICE logical is not defined or the file does not exist service configuration is made using the WASD_CONFIG_GLOBAL [Service] (deprecated) directives. For simple sites, those containing one or two services, the use of a separate service configuration file is probably not warranted. Once the number begins to grow this file offers a specific management interface for those services.
Precedence of service specifications:
WASD services are also known as virtual servers or virtual hosts and can provide multiple, autonomous sites from the one HTTP server. Services can each have an independent IP address, or multiple virtual sites can share a single IP address or a set of IP addresses. Whichever the case, the host name entered into the browser URL must be able to be resolved to the IP address of an interface configured on the HTTP server system. There is no design limit to the number of services that WASD can support. It can listen on any number of IP ports and for any number of virtual services for any given port.
The server must be able to resolve its own host name/address. It is not unknown for completely new systems to have TCP/IP configuration overlooked. The server must also be able to resolve the IP addresses of any configured virtual services (2.3 Virtual Services). Failure to do so will result in the service not being configured. To avoid startup issues in the absence of a usable DNS it is suggested that for fundamental, business-critical or otherwise important services, static entries be provided in the system TCP/IP agent's local database.
Changes to the service configuration file can be validated at the command-line before restart. This detects and reports any syntactical and fatal configuration errors but of course cannot check the intent of the rules.
In common with other configuration files, directives associated with a specific virtual service are introduced using a double-bracket delimited host specification (2.3 Virtual Services). When configuring a service the following three components specify the essential characteristics.
These WASD_CONFIG_SERVICE examples illustrate the directive.
A generic service is one that specifies a scheme and/or port but no specific host name. This is useful in a cluster where multiple systems all provide a basic service (e.g. a port 80 service). If the host name is omitted or specified as an asterisk the service substitutes the system's IP host name.
See Transport Layer Security of WASD Features and Facilities.
Multiple virtual SSL services (https:) sharing the same certificate can essentially be configured against any host name (unique IP address or alias) and/or port in the same way as standard services (http:). Services requiring unique certificates can only be configured for the same port number against individual and unique IP addresses (i.e. not against aliases). This is not a WASD restriction, it applies to all servers for significant SSL technical reasons.
For example, unique certificates for https://www.company1.com:443/ and https://www.company2.com:443/ can be configured only if COMPANY1 and COMPANY2 have unique IP addresses. If COMPANY2 is an alias for COMPANY1 they must share the same certificate. During startup service configuration the server checks for such conditions and issues a warning about "sharing" the service with the first configured.
When multiple instances are configured Server Administration page access, in common with all request processing, is automatically shared between those instances. There are occasions when consistent access to a single instance is desirable. The [ServiceAdmin] directive indicates that the service port number should be used as a base port and all instances create their own service with a unique port for access to that instance alone. The first instance to create an administration service uses the specified port, or the next successive port if it is already in use; the next instance will use the next available port number, and so on. A high port number should be specified. The Server Administration page lists these services for all server instances in the cluster. This port configuration is not intended for general request activity, although with appropriate mapping and other configuration there is nothing specifically precluding the use (remembering that the actual port in use by any particular instance may vary across restarts). In all other respects the services can (and should) be mapped, authorized and otherwise configured as any other.
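The port assignment described above can be pictured as each instance in turn taking the lowest free port at or above the base. The sketch below models that sequentially in one function; in practice each instance makes its own attempt, and the names and the example base port are hypothetical.

```python
def allocate_admin_ports(base_port, instance_count, ports_in_use):
    """Assign one administration port per instance, starting at base_port.

    ports_in_use: ports already bound by something else on the system
    Returns the ports in the order the instances would claim them.
    """
    taken = set(ports_in_use)
    allocated = []
    candidate = base_port
    for _ in range(instance_count):
        # Skip over any port that is already bound.
        while candidate in taken:
            candidate += 1
        if candidate > 65535:
            raise RuntimeError("ran out of port numbers")
        allocated.append(candidate)
        taken.add(candidate)
    return allocated
```

With a base of 44444, three instances, and port 44445 already bound elsewhere, the result is `[44444, 44446, 44447]`. This also shows why the port used by a particular instance can differ from one restart to the next.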
Both IP version 4 and 6 are concurrently supported by WASD. All networking functionality, service creation, SSL, proxy HTTP, proxy FTP and RFC1413 authorization is IPv6 enabled. If system TCP/IP services do not support IPv6 the expected error would be
Server configuration handles the standard dotted-decimal addresses of IPv4, as well as "normal" and "compressed" forms of standard IPv6 literal addresses, and a (somewhat) standard variation of these that substitutes hyphens for the colons in these addresses to allow the colon-delimited port component of a "URL" to be resolved. Alternatively, use the de facto standard method of enclosing the IPv6 address within square brackets, followed by any port component.
Normal | Compressed |
---|---|
1070:0:0:0:0:800:200C:417B | 1070::800:200C:417B |
0:0:0:0:0:0:13.1.68.3 | ::13.1.68.3 |
0:0:0:0:0:FFFF:129.144.52.38 | ::FFFF:129.144.52.38 |
hyphen-variants | |
1070-0-0-0-0-800-200C-417B | 1070--800-200C-417B |
0-0-0-0-0-0-13.1.68.3 | --13.1.68.3 |
0-0-0-0-0-FFFF-129.144.52.38 | --FFFF-129.144.52.38 |
In common with all virtual services, if a connection can be established with the system and service port the server can respond to that request. The first example binds a service to accept IPv4 connections for any address, while the second the same for IPv6 (and for IPv4 if the interface has IPv4 configuration).
If a service needs to be bound to a specific IP address then that can be specified using the [ServiceBind] directive using any of the literal address formats described above.
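The IPv6 literal forms described above (colon, hyphen-variant and square-bracketed) can be normalised with a few lines of code. The sketch below uses Python's standard `ipaddress` module to validate the result; it is an illustration rather than the server's parser, and the function name is hypothetical.

```python
import ipaddress


def parse_ipv6_literal(text):
    """Accept colon, hyphen-variant, or [bracketed] IPv6 literals.

    Returns (IPv6Address, port) where port is None when absent.
    """
    port = None
    if text.startswith("["):
        # [address]:port form; the brackets protect the address colons.
        address_part, _, remainder = text[1:].partition("]")
        if remainder.startswith(":"):
            port = int(remainder[1:])
        elif remainder:
            raise ValueError(f"unexpected text after ']': {remainder!r}")
    elif ":" in text and "-" in text:
        # Hyphen-variant with a port: the only colon delimits the port.
        address_part, _, port_text = text.rpartition(":")
        port = int(port_text)
        address_part = address_part.replace("-", ":")
    else:
        # Plain colon form, or hyphen-variant without a port.
        address_part = text.replace("-", ":")
    return ipaddress.IPv6Address(address_part), port
```

For example, `parse_ipv6_literal("1070--800-200C-417B:8080")` and `parse_ipv6_literal("[1070::800:200C:417B]:8080")` both give the same address with port 8080.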
TCP/IP Services for OpenVMS does not provide an asynchronous name resolution ACP call for IPv6 as it does for IPv4. This means that dynamic name resolution in IPv6 environments is (currently) an issue. See the server code module [SRC.HTTPD]TCPIP6.C for further detail and workarounds. Let's hope this significant deficiency in VMS' IPv6 support is addressed sooner than later!
In the twenty-first century the www. prefix to Web services is largely redundant. Generally www.host.name and host.name are treated as synonymous. WASD conditionals often need to distinguish precisely on the service name and in some cases this can mean a service for the www.host.name and the host.name.
The WASD global configuration directive
Where a service directive has an equivalent configuration directive (e.g. error report path) the service directive takes precedence. This allows specific virtual services to selectively override the generic configuration.
Configuration keywords equivalent to many of these WASD_CONFIG_SERVICE directives but usable against the deprecated WASD_CONFIG_GLOBAL [Service] directive and the /SERVICE qualifier are available for backward compatibility. See section Command Line Parameters in source file [SRC.HTTPD]SERVICE.C for a list of these keywords.
Some of these directives control the behaviour of proxy services. Other directives are Secure Sockets Layer (SSL) specific.
Specifies the scheme, host name (or asterisk) and port of a service.
Marks the port as administration service (7.4 Administration Services).
If the system has a multi-homed network interface this binds the service to the specific IP address and not to INADDR_ANY. Generally this will not be necessary. The literal address may be in IPv4 dotted-decimal or IPv6 normal or compressed hexadecimal.
Specifies the HTML <BODY> tag for server error and other report pages. This allows some measure of site "look-and-feel" in page colour, background, etc. to be maintained.
Enables a proxy service to originate HTTP-over-SSL requests. This is different to the CONNECT service enabled using [ServiceProxySSL]. It allows requests to be gatewayed between standard HTTP and Secure Sockets Layer.
Location of client certificate file if required to authenticate client connection.
Location of client private key file if required to authenticate client connection.
A comma-separated list of SSL ciphers to be used by the gateway to connect to SSL services. This parameter can be used to force the selection of stronger ciphers, with the connection refused if none can be agreed.
Unless this directive is enabled the Certificate Authority (CA) used to issue the service's certificate is not verified. Requires that a CA file be provided. See note in [ServiceClientSSLcipherList] above.
Specifies the location of the collection of Certificate Authority (CA) certificates used to verify the connected-to server's certificate (VMS file specification). See note in [ServiceClientSSLcipherList] above.
The abbreviation for the SSL protocol version to be used to connect to the SSL service. See note in [ServiceClientSSLcipherList] above.
Specifies the URL-format path to an optional, error reporting SSI document or script (2.10 Error Reporting). This path can subsequently be remapped during request processing.
When HTTP/2 is enabled globally this allows an HTTP/1.n-only service to be defined.
See HTTP/2 of WASD Features and Facilities.
Per-service access log format. See .
When request logging is enabled then by default all services are logged. This directive allows logging to be suppressed for this service.
The default behaviour when a non-SSL HTTP request is begun on an SSL service is to return a 400 error and short message. This directive instead redirects the client to the specified non-SSL service. The parameter can be an optional scheme (i.e. http:// or https://), optional full host name or IP address with optional port, or only a colon-delimited port number which will redirect using the current service name. A single colon is the minimum parameter and redirects to port 80 on the current service name. The default redirect code is 307 but this can be changed by providing a leading 301 or 302.
Enables and disables proxy request processing for this service.
Uses cookies to allow the proxy server to make every effort to relay successive requests from a given client to the same origin host. This is also known as client to origin affinity or proxy affinity capability.
Makes a proxy service require authorization before a client is allowed access via it. CHAIN allows an up-stream proxy server to request authorization. LOCAL enables standard server authorization. NONE disables authorization (default). PROXY enables HTTP proxy authorization.
Enables and disables proxy caching for a proxy service.
Specifies the next proxy host if chained.
Credentials for the up-stream proxy server (BASIC authentication only); in the format username:password.
Transfers octets through the proxy server. FIREWALL accepts a host and port specification before connecting. CONNECT is the traditional CONNECT protocol. RAW connects to a configured host and port.
Specifies the service as providing proxying of SSL requests. This is sometimes referred to as a "CONNECT" service. This proxies "https:" requests directly and is different to the HTTP-to-SSL proxying enabled using [ServiceProxyHttpSSL].
Enable "RawSocket" processing on the service. See the chapter on WebSocket scripting in WASD Web Services - Scripting.
Non-zero enables service sharing with an SSH server and sets the number of seconds for input timeout.
Specifies the location of the SSL certificates (VMS file specification).
A colon-separated list (OpenSSL syntax) of TLS/SSL ciphers that clients are allowed to use when connecting to SSL services. This parameter can be used to force the selection of stronger ciphers, with the connection refused if none can be agreed.
Specifies the location of the SSL private key (VMS file specification).
The default maximum period for session reuse is five minutes. This is the per-service equivalent of the global directive [SSLsessionLifetime].
When non-zero represents the number of seconds, or maximum age, of a HSTS "Strict-Transport-Security:" response header field. See Transport Layer Security of WASD Features and Facilities. There is an equivalent global directive.
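A minimal sketch of the relationship described above, assuming only the max-age attribute is emitted (any other attributes the server may add are not shown). The function name is hypothetical.

```python
from typing import Optional

def hsts_header(max_age_seconds: int) -> Optional[str]:
    """Return the HSTS response header line, or None when the setting is zero."""
    if max_age_seconds <= 0:
        return None  # zero means no header is generated
    return f"Strict-Transport-Security: max-age={max_age_seconds}"

print(hsts_header(31536000))  # one year, expressed in seconds
```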
To access this service a client must provide a verified CA client certificate.
Specifies the location of the collection of Certificate Authority (CA) certificates used to verify a peer certificate (VMS file specification).
When a client certificate is requested for authentication via TLS/SSL renegotiation this is the maximum number of kilobytes of POST/PROPFIND/PUT data buffered during the renegotiation. There is an equivalent global directive.
Level through a certificate chain a client is verified to.
The abbreviation for the TLS/SSL protocol version allowed to be used to connect to an SSL service. Using the directive a service may select preferred protocols.
A service configuration file can be maintained using a simple text editor and WASD_CONFIG_SERVICE.
Alternatively the Server Administration facility may be used. When using this interface for the first time ensure the WASD_CONFIG_SERVICE logical is correctly defined. If the file did not exist at server startup any services will have been created from the WASD_CONFIG_GLOBAL [Service] directive. These will be displayed as the existing services and will be saved to the configuration file the first time it is saved. Changes to the service configuration file require a server restart to put them into effect.
The [IncludeFile] is a directive common to all WASD configuration, allowing a separate file to be included as a part of the current configuration (2.1 Include File Directive).
Not all configuration directives may be shown depending on the type of service. For instance, unless a service is configured to provide proxy, only the [ServiceProxy] directive is displayed. To fully configure such a service enable it as proxy, save the file, then reload it. The additional directives will now be available.
There is always one empty service displayed each time the configuration menu is generated. This information may be changed appropriately and then saved to add new services to the configuration (of course, these will not be available until the server is restarted). To configure multiple new services add one at a time, saving each and reloading the file to provide a new blank service.
8.1Behaviour |
8.2Message File Format |
8.3Multiple Language Specifications |
8.4Supplied Message Files |
By default, the logical name WASD_CONFIG_MSG locates the global message configuration file. A text editor may be used to modify this configuration file. Changes require a server restart to put them into effect.
Message configuration is provided for two purposes.
Not all messages provided by the WASD server are customizable, only those generated for non-administrative content. As the WASD server can also report using information derived from the standard VMS message service (via sys$getmsg()) it is assumed a language-local implementation of this is in use as well. Unfortunately for the non-first-language-English Web and system administrators, the menus and messages used for administration purposes, etc., are still only in English. The intent of this facility is to provide non-administration clients only with a more familiar language environment.
Also note that the message database only applies to messages generated by the server, not to any generated by scripts, etc.
Changes to the message configuration file can be validated at the command-line before restart. This detects and reports any syntactical and fatal configuration errors but of course cannot check the intent of the rules.
When an error, or other message or string, needs to be provided for the client the message database is accessed using the following algorithm.
By default, the system-table logical name WASD_CONFIG_MSG locates a common
message file, unless an individual message file is specified using a job-table
logical name. Simple editing of the message file changes the messages (after a
server restart, of course). Comment lines may be included by prefixing them
with the hash character ("#"), and lines continued by ensuring the last
character is a backslash ("\"). The server will concurrently support an
additional 3 languages to the base English (although this can be increased by
recompilation).
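The comment and continuation conventions can be sketched as follows. This is an illustrative reader, not the server's message loader; it assumes a trailing backslash continues a line, that a comment is only recognized at the start of a logical line, and the sample message lines are invented.

```python
def logical_lines(text: str):
    """Yield logical lines: comment lines dropped, continued lines joined."""
    pending = ""
    for raw in text.splitlines():
        if not pending and raw.lstrip().startswith("#"):
            continue  # comment line
        if raw.endswith("\\"):
            pending += raw[:-1]  # drop the continuation character, keep reading
            continue
        line = pending + raw
        pending = ""
        if line.strip():
            yield line
    if pending.strip():
        yield pending  # file ended while a line was still being continued

# Hypothetical file content, for demonstration only.
sample = "# a comment\n01 first part \\\nsecond part\n02 another\n"
print(list(logical_lines(sample)))
```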
As illustrated below the message file comprises a series of sections.
Directives enclosed by square-brackets provide information to the message
loader.
The square-bracketed section headings have the following functions.
The base language (the highest numbered, which should always be English)
must have precisely the right number of messages required by the server, too
few or too many and the server will not start! Additional languages do not
have to reassign every message! The base language will supply any not
assigned. A message number of zero is disabled and completely ignored.
If messages contain HTML tags that markup must not interfere with the
general HTML page it is used within.
Some messages are a composite of multiple strings each of which is used on a
different part of the one page (e.g. for the [upd] edit-page). Each of the
strings is delimited by the vertical bar "|". Care must be taken when
customizing these strings that the overall number stays the same and that the
length of each does not become excessive. Although it will not disrupt the
server it may significantly disrupt the page layout.
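When customizing a composite message, a quick check that the number of bar-delimited strings is unchanged can catch mistakes before a restart. The sketch below is a hypothetical helper with invented message text; it is not part of WASD.

```python
def same_field_count(original: str, customized: str) -> bool:
    """True when both composite messages contain the same number of '|' fields."""
    return len(original.split("|")) == len(customized.split("|"))

# Invented strings, for demonstration only.
base = "Update|Preview|Cancel"
print(same_field_count(base, "Aktualisieren|Vorschau|Abbrechen"))  # fields preserved
print(same_field_count(base, "Aktualisieren|Vorschau"))            # one field lost
```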
All message numbers must be included. To provide an empty string for any
one message (not recommended) provide the line with nothing following the
message number.
Multiple language messages can be specified in two ways: within the one file, or using multiple files located by a multivalued logical name.
Language availability is specified through the use of [Language] directives.
These must be numbered from 1 to the count of those supplied. The highest
numbered language must have the complete set of messages for this is the
fallback when obtaining any message (this would normally be "en"). The
[Language] may be specified as a comma-separated list of equivalent or similar
specifications, which during request processing will be matched against a
client specified list of accepted-languages one at a time in specified order.
A wildcard may be specified which matches all fitting the template. In this
manner a single language can be used also to match minor variants or language
specification synonyms.
Note that the messages for each language must use the *first* language
specification provided in the [Language] list. In the example above all
messages for language 1 would be introduced using 'es', for language 2 with
'de' and for language 3 with 'en'.
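The matching behaviour described above might be sketched like this. The language table, its wildcard specifications and the function name are assumptions made for illustration; quality values in a client's accepted-language list are ignored, and the server's actual matching may differ in detail.

```python
from fnmatch import fnmatchcase

# Hypothetical [Language] list: number -> specifications.
# The first specification in each list is the one that tags the messages.
LANGUAGES = {1: ["es", "es-*"], 2: ["de", "de-*"], 3: ["en", "en-*"]}

def select_language(accepted):
    """Return the language number for the first client preference that matches."""
    for wanted in accepted:  # client order is respected
        wanted = wanted.strip().lower()
        for number in sorted(LANGUAGES):
            if any(fnmatchcase(wanted, spec) for spec in LANGUAGES[number]):
                return number
    return max(LANGUAGES)  # highest number holds the complete fallback set

print(select_language(["fr", "de-AT", "en"]))
print(select_language(["ja"]))
```

Here a client preferring French, then Austrian German, then English is given language 2, because "de-at" fits the "de-*" template before "en" is considered.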
With this approach a logical name containing multiple file names is defined
(more commonly described as a logical search list). The final file specified
must contain the full message set. Files specified prior to this, can contain
as many or as few of the full set as is desired. A [Language] number does not
need to be specified as they are processed in the order the logical name
specifies them in. Other language file directives are required.
The following is an example of a logical name providing the same three
languages in the examples above.
The file contents would be as follows (very contrived examples :-)
The major advantage of maintaining multiple files in this way is there
is no need to merge files when a new revision is required. Just update the
version number and add any new required messages to the existing secondary
file.
Any non-English message files that are provided to the author will be
included for general use (please take the time to support this endeavour) in
the WASD_ROOT:[EXAMPLE]
directory.
Note that message files can become out-of-date as server versions change,
requiring modifications to the message database. Check the version information
and/or comments at the top of candidate message files, however even slightly
dated files may serve as a good starting point for a locale-specific message
base.
Note
Care must be taken with the message file or the server may refuse to start!
Worst-case; the WASD_CONFIG_MSG.CONF message file may be copied from
[EXAMPLE].
9.1Non-File Content Caching |
9.2Permanent and Volatile |
9.3Cache Suitability Considerations |
9.4Cache Content Validation |
9.5Cache Configuration |
9.6Cache Control |
9.7Circumventing The Cache |
WASD HTTPd provides an optional, configurable, monitorable file data and revision time cache. File data is cached so that requests for documents can be fulfilled without reference to the underlying file system, potentially reducing request latency and, more importantly, improving overall server performance and reducing system impact. File revision time is cached so that requests specifying an "If-Modified-Since:" header can also benefit. Files are cached using a hash derived from the VMS file-system path equivalent generated during the mapping process (i.e. it represents the file name) but before any actual RMS activity. WASD can also cache the content of responses from non-file sources. This can be useful for reducing the system impact of frequently accessed, dynamically generated, but otherwise relatively static pages. These sources are cached using a hash derived from the virtual service connected to and the request URI.
Caching, in concept, attempts to improve performance by keeping data in storage that is faster to access than the usual location. The performance improvement can be assessed in three basic ways; reduction of
This cache is provided to address all three. Where networks are particularly responsive a reduction in request latency can often be noticeable. It is also suggested a cache "hit" may consume fewer CPU cycles than the equivalent access to the (notoriously expensive) VMS file system. Where servers are particularly busy or where disk subsystems are particularly loaded a reduction in the need to access the file system can significantly improve performance while simultaneously reducing the impact of the server on other system activities.
A comparison between cached and non-cached performance is provided in the "Server Performance" section.
Term | Description |
---|---|
hit | Refers to a request path being found in cache. If the data is still valid the request can be supplied from cache. |
flushing | Occurs when the cache becomes full, with older, less frequently used cache entries being removed from the cache and replaced by other files. |
loading | Refers to reading the contents of a file into cache memory. |
permanent | These entries are loaded once and remain in the cache until it is explicitly purged by the administrator or the server is restarted. They are not flushed or revalidated. |
revalidate | Compare the cache entry's size and modification date-time to the file it represents in the file-system. Obviously a difference indicates the content has changed. |
valid | The file from which the cached data was originally read has not had its revision date changed (the implication being the file contents have not changed). |
volatile | Entries have the original file periodically checked for modification and are reloaded if necessary. They can also be flushed if demand for space requires it. |
The WASD cache was originally provided to reduce file-system access (a somewhat expensive activity under VMS). With the expansion in the use of dynamically generated page content (e.g. PHP, Perl, Python) there is an obvious need to reduce the system impact of some of these activities. While many such responses have content specific to the individual request a large number are also generated as general site pages, perhaps with simple time or date components, or other periodic information. Non-file caching is intended for this type of dynamic content.
Revalidation of non-file content is fraught with a number of issues and so is not provided. Instead the cache entry is flushed on expiry of the [CacheValidateSeconds], or as otherwise specified by path mapping, and the request is serviced by the content source (script, PHP, Perl, etc.) with the generated response being freshly cached. All of the considerations described in 9.4 Cache Content Validation apply equally to file and non-file content.
Determining which non-file content is cached and which not, and how long before flushing, is done using mapping rules (10.5.5 SET Rule). The source of non-file cache content is specified using one or a combination of the following SET rules against general or specific paths.
A good understanding of site requirements and dynamic content sources, along with considerable care in specifying cache path SETings, is required to cache dynamic content effectively. It is especially important to get the content revalidation period appropriate to the content of the pages. This is specified using the following path SETings.
For example. To cache the content of PHP-generated home pages that contain a time-of-day clock, resolving down to the minute, would require a mapping rule similar to the following.
The WASD file cache provides for some resources to be permanently cached while others are allowed to be moved into and out of the cache according to demand. Most sites have at least some files that are fundamental components of the site's pages, are rarely modified, commonly accessed, and therefore should be permanently available from cache. Other files are modified on a regular or ad hoc basis and may experience fluctuations in demand. These more volatile resources should be cached based on current demand.
Volatile caching is the default with the site administrator using mapping rules to indicate to the server which resources on which paths should be permanently cached (9.5 Cache Configuration).
Although permanent and volatile entries share the same cache structure and are therefore subject to the configuration's maximum number of cache entries, the memory used to store the cached file data is derived from separate pools. The total size of all volatile entries' data is constrained by configuration. In contrast there is no configuration limit placed on the quantity of data that can be cached by permanent entries. One of the purposes of the permanent aspect of the cache is to allow the site administrator considerable discretion in the configuration of the site's low-latency resources, no matter how large or small that might be. Of course there is the ultimate constraint of server process and system virtual memory limits on this activity. It should also be kept in mind that unless sufficient physical memory is available to keep such cached content in-memory the site may only end up trading file-system I/O for page file I/O.
A cache is not always of benefit! The cost may outweigh the return.
Any cache's efficiencies can only occur where subsets of data are consistently being demanded. Although these subsets may change slowly over time, a consistently and rapidly changing aggregate of requests loses the benefit of more readily accessible data to the overhead of cache management, due to the constant and continuous flushing and reloading of cache data. This server's cache is no different; it will only improve performance if the site experiences some consistency in the files requested. For sites that have only a small percentage of files being repeatedly requested it is probably better that the cache be disabled. The other major consideration is available system memory. On a system where memory demand is high there is little value in having cache memory sitting in page space, trading disk I/O and latency for paging I/O and latency. On memory-challenged systems cache is probably best disabled.
To help assessment of the cache's efficiency for any given site monitor the Server Administration facility's cache report.
Two sets of data provide complementary information, cache activity and file request profile.
This summarizes the cache search behaviour, in particular that of the hash table.
The "searched" item, indicates the number of times the cache has been searched. Most importantly, this may include paths that can never be cached because they represent non-file requests (e.g. directory listings). Requests involving scripts, and some others, never attempt a cache search.
The "hit" item, indicates the number of times the hash table directly provided a cached path. This is very efficient.
The "miss" item, indicates the number of times the hash table directly indicated a path was not cached. This is decisive and is also very efficient.
The "collision" item, indicates the number of times multiple paths resolved to the same hash table entry. Collisions require further processing and are far less efficient. The sub-items, "collision hits" and "collision misses" indicate the number of times that further processing resulted in a found or not-found cache item.
A large number of cache misses compared to searches may only indicate a large number of non-cacheable requests and so, depending on that further datum, is not of great concern. A large proportion of collisions (say greater than 12.5%) does, however, indicate that either the hash table size needs increasing (1024 should be considered a minimum) or the hashing algorithm in the software needs reviewing :-)
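The collision guideline can be expressed as simple arithmetic. This sketch assumes the proportion is taken against the "searched" count, which the text does not state explicitly; the function names and the sample figures are invented.

```python
COLLISION_THRESHOLD_PERCENT = 12.5  # guideline figure from the text

def collision_percentage(searched: int, collisions: int) -> float:
    """Collisions as a percentage of cache searches."""
    if searched == 0:
        return 0.0
    return 100.0 * collisions / searched

def hash_table_needs_attention(searched: int, collisions: int) -> bool:
    """True when collisions exceed the guideline proportion."""
    return collision_percentage(searched, collisions) > COLLISION_THRESHOLD_PERCENT

# Invented report figures, for demonstration only.
print(collision_percentage(8000, 1400), hash_table_needs_attention(8000, 1400))
```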
This summarizes the site's file request profile.
With the "loads not hit" item, the count represents the cumulative number of files loaded but never subsequently hit. If this percentage is high it means most files loaded are never hit, indicating the site's request profile is possibly unsuitable for caching.
The item "hits" represents the cumulative, total number of hits against the cumulative, total number of loads. The percentage here can range from zero to many thousands of percent :-) with less than 100% indicating poor cache performance and from 200% upwards better and good performance. The items "1-9", "10-99" and "100+" show the count and percentage of total hits that occurred when a given entry had experienced hits within that range (e.g. if an entry has had 8 previous hits, the ninth increments the "1-9" item whereas the tenth and eleventh increment the "10-99" item, etc.)
Other considerations also apply when assessing the benefit of having a cache. For example, a high number and percentage of hits can be generated while the percentage of "loads not hit" could also be very high. The explanation for this would be one or two frequently requested files being hit while most others are loaded, never hit, and flushed as other files request cache space. In situations such as this it is difficult to judge whether cache processing is improving performance or just adding overhead.
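The range items and the hits percentage can be illustrated with a short sketch. The function names are hypothetical; the bucket boundaries follow the example in the text (an entry's ninth hit counts under "1-9", its tenth under "10-99").

```python
def hit_bucket(hit_number: int) -> str:
    """Return the report item incremented by an entry's Nth hit (N >= 1)."""
    if hit_number < 1:
        raise ValueError("hit numbers start at 1")
    if hit_number <= 9:
        return "1-9"
    if hit_number <= 99:
        return "10-99"
    return "100+"

def hits_percentage(hits: int, loads: int) -> float:
    """Cumulative hits as a percentage of cumulative loads."""
    return 100.0 * hits / loads if loads else 0.0

print([hit_bucket(n) for n in (9, 10, 11, 100)])
print(hits_percentage(250, 100))  # above 200%, so in the "good" range
```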
The cache will automatically revalidate the volatile entry file data after a specified number of seconds ([CacheValidateSeconds] configuration parameter), by comparing the original file revision time to the current revision time. If different the file contents have changed and the cache contents declared invalid. If found invalid the file transfer then continues outside of the cache with the new contents being concurrently reloaded into the cache. Permanent entries are not subject to revalidation and the associated reloading.
Cache validation is also always performed if the request uses "Cache-Control:" with no-cache, no-store or max-age=0 attributes (HTTP/1.1 directive), or if a "Pragma: no-cache" field (HTTP/1.0 directive) is present. These request directives are often associated with a browser agent reload page function. Hence there is no need for any explicit flushing of the cache under normal operation. If a document does not immediately reflect any changes made to it (i.e. validation time has not been reached) validation (and consequent reload) can be "forced" with a browser reload. Permanent entries are also not subject to this source of revalidation. The configuration directive [CacheGuardPeriod] limits this form of revalidation when used within the specified period since last revalidated. It has a default value of fifteen seconds.
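The client-forced revalidation decision could be sketched as below. This is a simplified model rather than the server's logic: header names are assumed to be supplied in lower case, and the function and parameter names are invented.

```python
FORCING_DIRECTIVES = {"no-cache", "no-store", "max-age=0"}

def client_requests_revalidation(headers: dict) -> bool:
    """True when the request carries a reload-style cache directive."""
    cache_control = headers.get("cache-control", "").lower()
    directives = {item.strip() for item in cache_control.split(",")}
    if directives & FORCING_DIRECTIVES:
        return True
    return "no-cache" in headers.get("pragma", "").lower()

def honour_client_revalidation(headers: dict, permanent: bool,
                               seconds_since_validated: float,
                               guard_period: float = 15.0) -> bool:
    """Apply the permanent-entry exemption and the guard period."""
    if permanent:
        return False  # permanent entries are never revalidated
    if seconds_since_validated < guard_period:
        return False  # validated too recently; the guard period applies
    return client_requests_revalidation(headers)

print(honour_client_revalidation({"cache-control": "max-age=0"}, False, 60))
```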
If a site's contents are relatively static the validation seconds could be set to an extended period (say 3600 seconds, one hour) and then rely on an explicit "reload" to force validation of a changed file.
The entire cache may be purged of cached data, both volatile and permanent entries, either from the Server Administration facility or using command line server control.
The cache is controlled using WASD_CONFIG_GLOBAL configuration file and WASD_CONFIG_MAP mapping file directives. A number of parameters control the basics of cache behaviour.
Mapping rules (10.5.5 SET Rule) allow further tailoring of cache behaviour based on request (file) path. Those files that should be made permanent entries are indicated using the cache=perm directive. In the following example all files in the WASD runtime directories (directory icons, help files, etc.) are made permanent cache entries at the same time the path is mapped.
Of course, specified file types as well as specific paths can be mapped in this way. Here all files in the site's /help/ path are made permanent entries except those having a .PS type (PostScript documents).
The configuration directive [CacheFileKBytesMax] puts a limit on individual file size. Those exceeding that limit are considered too large and not cached. It is possible to override this general constraint by specifying a maximum size (in kilobytes) on a per-path basis.
Caching may be disabled and/or enabled for specified paths and subpaths.
The cache may be enabled, disabled and purged from the Server Administration facility. In addition the same control may be exercised from the command-line using
If cache parameters are altered in the configuration file the server must be restarted to put these into effect. Disabling the cache on an ad hoc basis (from menu or command line) does not alter the contents in any way so it can merely be reenabled with use of the cache's previous contents resuming. In this way comparisons between the two environments may more easily be made.
There are often good reasons for bypassing or avoiding the cache. For instance, where a document is being refreshed within the cache revalidation period specified by [CacheValidateSeconds] (9.4 Cache Content Validation). There are two mechanisms available for bypassing or invalidating the file cache.
By default, the logical name WASD_CONFIG_MAP locates a common mapping rule file. Simple editing of the mapping file and reloading into the running server changes the processing rules. The [IncludeFile] is a directive common to all WASD configuration, allowing a separate file to be included as a part of the current configuration (2.1 Include File Directive).
Mapping rules are used for a number of different request processing purposes.
Mapping is basically for server-internal purposes only. The only time the path information of the request itself is modified is when a script component is removed. At all other times the path information remains unchanged. Path authorization is always applied to the path supplied with the request.
Rules are given a basic consistency check when loaded (i.e. server startup, map reload, etc.) If there is an obvious problem (unknown rule, missing component, path not absolute, etc.) a warning message is generated and the rule is not loaded into the database. This will not cause the server startup to fail. These warning messages may be found in the server process log.
Changes to the mapping configuration file can be validated at the command-line before reload or restart. This detects and reports any syntactical and fatal configuration errors but of course cannot check the intent of the rules.
A server's currently loaded mapping rules may also be interrogated from the Server Administration menu (see Server Administration of WASD Features and Facilities).
The rules are scanned from first towards last, until a matching final rule is encountered (PASS, EXEC, SCRIPT, FAIL, REDIRECT, UXEC and USER) when the mapping pass concludes. Non-final rules (MAP and SET) perform the appropriate action and continue to the next rule. One, two or more passes through the rules may occur due to implicit processing (if the path contains a script component) or by explicit restart (SET map=restart).
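The scan order can be modelled with a small sketch. It is deliberately simplified: templates support only a single trailing wildcard, SET is treated as a no-op, multiple passes and restarts are omitted, and the rule set and paths are invented. It illustrates the final versus non-final distinction only and does not reproduce WASD's matcher.

```python
FINAL_RULES = {"PASS", "EXEC", "SCRIPT", "FAIL", "REDIRECT", "UXEC", "USER"}

def apply_template(path, template, result):
    """Match a template with one optional trailing '*'; return the new path or None."""
    if template.endswith("*"):
        prefix = template[:-1]
        if not path.startswith(prefix):
            return None
        tail = path[len(prefix):]
        return path if result is None else result.replace("*", tail, 1)
    if path != template:
        return None
    return path if result is None else result

def map_path(path, rules):
    """Scan rules first to last; stop at the first matching final rule."""
    for rule, template, result in rules:
        mapped = apply_template(path, template, result)
        if mapped is None:
            continue
        if rule in FINAL_RULES:
            return rule, mapped
        if rule == "MAP":
            path = mapped  # non-final: substitute and keep scanning
        # a SET rule would adjust path settings here and continue
    return None

# Hypothetical rules and paths, for demonstration only.
RULES = [
    ("MAP", "/docs/*", "/web/docs/*"),
    ("FAIL", "/web/private/*", None),
    ("PASS", "/web/*", "/device/web/*"),
]
print(map_path("/docs/a.html", RULES))
```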
The basis of path mapping is string pattern matching, comparing the request specified path, and optionally other components of the request when using configuration conditionals (5. Conditional Configuration), to a series of patterns, usually until one of the patterns matches, at which stage some processing is performed. Both wildcard and regular expression based pattern matching are available. All rules have a template (string pattern to match against the path). Some rules have a result (how to restructure the components matching from the template).
As described in 2.3 Virtual Services virtual service syntax may be used with mapping rules to selectively apply rules to one specific service. If virtual services are configured rule interpretation sees only rules common to all services and those specific to its own service (host address and port). In all other aspects rule interpretation applies as described above.
Naturally, each rule that needs to be processed adds a little to consumed CPU, introduces some latency, and ultimately reduces throughput. The test-bench has shown this to be acceptably small compared to the overall costs of responding to a request. Using the ApacheBench tool on a COMPAQ Professional Workstation XP1000 with 2048MB, VMS V8.3, TCP/IP Service 5.7 and WASD v10.1, with a simple access to /wasd_root/exercise/0k.txt showed approximately 744 requests/second throughput using the following mapping file.
After adding various quantities of the same intervening rule
Rule Count | Requests/S | Throughput |
---|---|---|
0 | 744 | baseline |
100 | 701 | -5.8% |
200 | 665 | -10.6% |
500 | 571 | -23.3% |
1000 | 461 | -38.4% |
Although this is a fairly contrived set-up and actual real-world rule-sets are more complex than this, even one hundred rules is a very large set, and it does indicate that for all intents and purposes mapping rules may be used to achieve desired objectives without undue concern about impact on server throughput.
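The throughput column can be recomputed from the requests/second figures as the percentage change against the 744 requests/second baseline. The sketch below uses only the numbers in the table; the function name is an assumption.

```python
BASELINE = 744  # requests/second with no intervening rules
MEASURED = {100: 701, 200: 665, 500: 571, 1000: 461}  # rule count -> requests/second

def throughput_change(requests_per_second: int, baseline: int = BASELINE) -> float:
    """Percentage change in throughput relative to the baseline, to one decimal."""
    return round(100.0 * (requests_per_second - baseline) / baseline, 1)

for rule_count, rps in MEASURED.items():
    print(rule_count, rps, throughput_change(rps))
```

From the figures as listed, the first three rows reproduce the table. The 1000-rule row computes to roughly -38.0% rather than the tabulated -38.4%; the text gives no basis for deciding which figure is the intended one.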
The VMS file system in mapping rules is always assumed to begin with a device or concealed device logical. Specifying a Master File Directory (MFD) component, the [000000] is completely optional, although always implied. The mapping functions will always insert one if required for correct file system syntax. That is, if the VMS file system mapping of a path results in a file in a top-level directory an MFD is inserted if not explicitly present in the mapping. For example, both of the following paths
Concealed device logicals are created using the following syntax:
The logical name may be multi-valued and provided the DIRECTORY command can be used successfully with them (as described above) should be amenable to WASD directory listing producing equivalent results.
For ODS-2 volumes, when during rule mapping of a path to a VMS file specification an RMS-invalid character (e.g. "+") or syntax (e.g. multiple periods) is encountered a dollar symbol is substituted in an attempt to make it acceptable. This functionality is often useful for document collections imported to the local web originating from, for instance, a Unix site that utilizes non-VMS file system syntax. The default substitution character may be changed on a per-path basis using the SET rule (10.5.5 SET Rule).
OpenVMS Alpha V7.2 introduced a new on-disk file system structure, ODS-5. This brings to VMS in general, and WASD and other Web servers in particular, a number of issues regarding the handling of characters previously not encountered during (ODS-2) file system activities. ODS-2 and ODS-5 volumes should be automatically distinguished by the server; however, it is possible to force interpretation using a path mapping rule (10.5.5 SET Rule).
There is a standard for characters used in HTTP requests paths and query strings (URLs). This includes conventions for the handling of reserved characters, for example "?", "+", "&", "=" that have specific meanings in a request, characters that are completely forbidden, for example white-space, control characters (0x00 to 0x1f), and others that have usages by convention, for example the "~", commonly used to indicate a username mapping. The request can otherwise contain these characters provided they are URL-encoded (i.e. a percentage symbol followed by two hexadecimal digits representing the hexadecimal-encoded character value).
There is also an RMS standard for handling characters in extended file specifications, some of which are forbidden in the ODS-2 file naming conventions, and others which have a reserved meaning to either the command-line interpreter (e.g. the space) or the file system structure (e.g. the ":", "[", "]" and "."). Generally the allowed but reserved characters can be used in ODS-5 file names if escaped using the "^" character. For example, the ODS-2 file name "THIS_AND_THAT.TXT" could be named "This^_^&^_That.txt" on an ODS-5 volume. More complex rules control the use of character combinations with significance to RMS, for instance multiple periods. The following file name is allowed on an ODS-5 volume, "A-GNU-zipped-TAR-archive^.tar.gz", where the non-significant period has been escaped making it acceptable to RMS.
Of course characters absolutely forbidden in request paths must still be URL-encoded, the most obvious example is the space. RMS will accept the file name "This^ and^ that.txt" (i.e. containing escaped spaces) but the request path would need to be specified as "This%20and%20that.txt".
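The URL-encoding of a request path can be demonstrated with Python's standard library. This shows the general encoding convention only, using the file name from the text; the directory prefix and helper name are invented.

```python
from urllib.parse import quote, unquote

def encode_request_path(path: str) -> str:
    """Percent-encode characters not permitted in a request path, keeping '/'."""
    return quote(path, safe="/")

encoded = encode_request_path("/docs/This and that.txt")
print(encoded)           # spaces become %20
print(unquote(encoded))  # decoding recovers the original path
```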
Unlike for ODS-2 volumes, ODS-5 volumes do not have "invalid" characters, so no processing is performed to ensure RMS compliance.
ODS-5 allows for some file name ambiguity in web-space.
For example the file name
In addition, the two files
To avoid this situation a potentially ambiguous file name containing an escaped period and no type (extension) is ignored by directory listings and WebDAV property lists. When an ambiguous file name is detected it is reported in WATCH reports.
While these sorts of situations are corner-cases it is best to try and avoid interesting file names that can challenge the rather convoluted VMS file-system environment.
When the server generates a path to be returned to the browser, either in a viewable page such as a directory listing or error message, or as a part of the HTTP transaction such as a redirection, the path will contain the URL-encoded equivalent of the canonical form of an extended file specification escaped character. For example, the file name "This^_and^_that.txt" will be represented by "This%20and%20that.txt".
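A rough sketch of that conversion follows, assuming only the simple escape forms shown in this chapter ("^_" for a space, and "^" followed by a single literal character). Other escape forms that RMS may use are not handled, and the function name is hypothetical.

```python
from urllib.parse import quote

def ods5_name_to_url(name: str) -> str:
    """Unescape simple '^' sequences, then percent-encode the result."""
    out = []
    i = 0
    while i < len(name):
        ch = name[i]
        if ch == "^" and i + 1 < len(name):
            following = name[i + 1]
            out.append(" " if following == "_" else following)
            i += 2  # consume the escape character and the escaped character
        else:
            out.append(ch)
            i += 1
    return quote("".join(out))

print(ods5_name_to_url("This^_and^_that.txt"))
print(ods5_name_to_url("A-GNU-zipped-TAR-archive^.tar.gz"))
```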
When presenting a file name in a viewable page the general rule is to also provide this URL-equivalent of the unescaped file name, with a small number of exceptions. The first is a directory listing where VMS format has been requested by including a version component in the request file specification. The second is in similar fashion, but with the tree facility, displaying a directory tree. The third is in the navigation page of the UPDate menu. In all of these instances the canonical form of the extended file specification is presented (although any actual reference to the file is URL-encoded as described above).
These are the categories of mapping rules.
If the URL path matches the template, substitute the result string for the path and use that for further rule processing. Both template and result paths must be absolute (i.e. begin with "/").
If the URL path matches the template, substitute the result if present (if not just use the original URL path), processing no further rules.
The result should be either a physical VMS file system specification in URL format or an HTTP status-code message (see below). If there is a direct correspondence between the template and result the result may be omitted.
An HTTP status-code message can be provided as a result. The server then generates a response corresponding to that status code containing the supplied message. Status-code results should be enclosed in one of single or double quotes, or curly braces. See examples. A 3nn status results in a redirection response with the message text comprising the location. Codes 4nn and 5nn result in an error message. Other code ranges (e.g. 0, 1nn, 2nn, etc.) simply cause the connection to be immediately dropped, and can be used for that purpose (i.e. with no indication of why!).
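The three behaviours described for status-code results can be summarised in a short Python sketch. The function name and the returned labels are hypothetical and are only a way of restating the paragraph above, not part of the server.

```python
def status_result_action(code):
    """Classify a status-code result as described in the text:
    3nn redirects, 4nn and 5nn report an error, anything else
    drops the connection without a response."""
    if 300 <= code <= 399:
        return "redirect"   # message text is used as the location
    if 400 <= code <= 599:
        return "error"      # message text is used in the error report
    return "drop"           # e.g. 0, 1nn, 2nn


for code in (302, 403, 500, 200, 0):
    print(code, status_result_action(code))
```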
If the URL path matches the template, prohibit access, processing no further rules. The template path must be absolute (i.e. begin with "/").
If the URL path matches the template, substitute the result string for the path. Process no further rules. Redirection rules can provide result URLs in one of a number of formats, each with a slightly different behaviour.
The USER rule maps a VMS user account default device and directory (i.e. home directory) into a request path. That is, the base location for the request is obtained from the VMS system's SYSUAF file. This is usually invoked by a request path in the form "/~username/", see ‘Mapping User Directories’ in 10.9 Conditional Mapping for more detailed information.
If the path matches the template then the result is substituted, with the following conditions. At least one wildcard must be present. The first wildcard in the result substitutes the username's home directory into the path (in place of the "~username"). Any subsequent wildcard(s) substitute corresponding part(s) of the original path.
If the user DANIEL's default device and directory were
Also see WASD Scripting Environment for further information.
The EXEC/UXEC and SCRIPT directives have the variants EXEC+/UXEC+ and SCRIPT+. These behave in exactly the same fashion and simply mark the rule as representing a CGIplus script environment.
The EXEC/UXEC rules map script directories.
The SCRIPT rule maps script file names. It behaves a little differently to the EXEC rule, essentially supplying in a single rule the effect of a MAP then an EXEC rule.
Both rules must have a template and result, and both must end in a wildcard asterisk. The placement of the wildcards and the subsequent functionality is slightly different however. Both template and result paths must be absolute (i.e. begin with "/").
The EXEC rule requires the template's asterisk to immediately follow the slash terminating the directory specification containing the scripts. The script name follows immediately as part of the wildcard-matched string. For example:
If the URL path matches the template, the result, including the first slash-terminated part of the wildcard-matched section, becomes the URL format physical VMS file specification of the script to be executed. What remains of the original URL path is used to create the path information. Process no further rules.
The EXEC rule therefore matches multiple script specifications without further rules, the script name being supplied with the URL path. As a consequence any script (i.e. procedure, executable) in the specified directory is accessible, a possible security concern if script management is distributed.
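The way an EXEC rule divides a URL path into a script specification and path information can be sketched in Python. This is an illustration of the description above under simplifying assumptions (a single trailing wildcard, no virtual services or conditionals). The function name, the "/cgi-bin/*" template and result, and the "query" script name are hypothetical.

```python
def exec_map(template, result, url_path):
    """Split url_path into (script specification, path information)
    in the manner described for the EXEC rule, or return None if
    the template does not match."""
    if not (template.endswith("/*") and result.endswith("/*")):
        raise ValueError("template and result must both end in '/*'")
    prefix = template[:-1]                 # directory part, e.g. "/cgi-bin/"
    if not url_path.startswith(prefix):
        return None
    matched = url_path[len(prefix):]       # wildcard-matched string
    # the first slash-terminated part of the match is the script name
    script_name, slash, rest = matched.partition("/")
    return result[:-1] + script_name, slash + rest


print(exec_map("/cgi-bin/*", "/cgi-bin/*", "/cgi-bin/query/web/doc/"))
```

Any name supplied after the directory becomes the script name, which is why every script in the directory is reachable through the one rule.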
A variation on the "exec" rules allows a Run-Time Environment (RTE) to be mapped. An RTE is a persistent scripting environment not unlike CGIplus. The essential difference is an RTE provides an environment in which a variety of scripts can be run. It is often an interpreter, such as Perl, where the advantages of persistence (reduced response latency and system impact) are available. For more information on RTEs and how they operate see the WASD Scripting Environment document.
The RTE executable is specified in parentheses prefixed to the mapping result, as shown in this example:
The SCRIPT rule requires the template's asterisk to immediately follow the unique string identifying the script in the URL path. The wildcard-matched string is the following path, and supplied to the script. For example:
If the URL path matches the template, the result becomes the URL format physical VMS file specification for the DCL procedure of the script to be executed (the default file extension of ".COM" is not required). What remains of the original URL path is used to create the path information. Process no further rules.
Hence, the SCRIPT rule will match only the script specified in the result, making for finely-granular scripting at the expense of a rule for each script thus specified. It also implies that only the script name need precede any other path information.
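For comparison with the EXEC rule, the SCRIPT rule's behaviour can be sketched in Python in the same way. Again this is only an illustration of the description above. The function name and the "/conan*" template with its "/cgi-bin/conan*" result are hypothetical.

```python
def script_map(template, result, url_path):
    """Return (script specification, path information) in the manner
    described for the SCRIPT rule, or None if the template does not
    match. The result names exactly one script."""
    if not (template.endswith("*") and result.endswith("*")):
        raise ValueError("template and result must both end in '*'")
    prefix = template[:-1]                 # unique string, e.g. "/conan"
    if not url_path.startswith(prefix):
        return None
    # everything matched by the wildcard is path information
    return result[:-1], url_path[len(prefix):]


print(script_map("/conan*", "/cgi-bin/conan*", "/conan/web/doc/"))
```

Unlike the EXEC sketch, the script specification never varies with the request, so one rule is needed per script.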
It may be thought of as a more efficient implementation of the equivalent functionality using two CERN rules, as illustrated in the following example:
The UXEC rule is an analog to the EXEC rule, except it is used to map user scripts. It requires two mapping asterisks, the first for the username, the second for the script name. It must be used in conjunction with a SET script=as=~ rule. For example:
For further information see ‘User Account Scripting’ in 10.10.1 Using The SYSUAF and the Introduction of WASD Scripting Environment.
It is conventional to locate script images in WASD_ROOT:[AXP-BIN] or WASD_ROOT:[VAX-BIN] (depending on the platform), and procedures, etc. in WASD_ROOT:[CGI-BIN]. These multiple directories are accessible via the single search list logical CGI-BIN.
Script files can be located in areas completely outside of the WASD_ROOT tree. Two approaches are available.
Generally directories are specified as locations for script files. This is the more common application, with the EXEC rules used as in this example
Mapping a file type into an EXEC behaviour is also supported. This allows all files within the specified path and with the matching file suffix (extension) to be activated as scripts. Of course a script runtime must be available for the server to be able to activate it. The following example demonstrates mapping all files ending in .CGI in the /web/ tree as executable scripts.
The SET rule does not change the mapping of a path, it just sets one or more characteristics against that path that affect the subsequent processing in some way. It is a general purpose rule that conveniently allows the administrator to tell the server to process requests with particular paths in some ad hoc and generally useful fashion. Most SET parameters are single keywords that act as boolean switches on the request, some require parameter strings. Multiple space-separated parameters may be set against the one path in a single SET statement.
Rule | Description |
---|---|
ACCEPT=LANG= DEFAULT=<language> | sets the default language |
ACCEPT=LANG= CHAR=<character> | sets the delimiting character |
ACCEPT=LANG= VARIANT=<name>|<type> | allows the alternate file-type variant to be specified |
ACCEPT=LANG= (DEFAULT=<language>, CHAR=<character>) | sets both (etc.) |
NOACCEPT=LANG | disables language variant processing (on a subtree for example) |
For detailed configuration information see 2.8 Language Variants.
Rule | Description |
---|---|
ALERT=MAP | generates this alert immediately after path mapping (i.e. before the request actually begins being processed) |
ALERT=AUTH | after authorization (i.e. when any remote username has been resolved) |
ALERT=<integer> | if the response HTTP status matches the specific integer |
ALERT=END | at the conclusion of process (the default) |
NOALERT | cancels alerts on this path (perhaps subpath) |
Rule | Description |
---|---|
[NO]AUTH=ALL | All requests matching this path must have been subject to authorization or fail with a forbidden status. This is a per-path equivalent of implementing the per-server /AUTHORIZE=ALL policy, and is a little "belt and braces" in a certain sense, but does permit a site to further avoid unintended information leakage (in this case through the failure to ensure a given path has authorization). |
[NO]AUTH=ONCE | If a request path contains both a script component and a resource component by default the WASD server makes sure both parts are authorized before allowing access. This can be disabled using this path setting. When this is done only the original request path undergoes authorization. |
AUTH=REVALIDATE=<hh:mm:ss> | Authorization is cancelled and the client requested to reenter the username and password if this period expires between authorized requests. Overrides configuration directive [AuthRevalidateUserMinutes]. |
AUTH=SYSUAF= PWDEXPURL=<string> | Parallels the [AuthSysUafPwdExpURL] configuration directive, allowing it to be set on a per-path or virtual service basis. |
Rule | Description |
---|---|
CACHE=NONE | disables caching of files matching this rule |
CACHE=EXPIRES=0 | cancels previous mapped expiry |
CACHE=EXPIRES=DAY | expires on change of day |
CACHE=EXPIRES=HOUR | expires on change of hour |
CACHE=EXPIRES=MINUTE | expires on change of minute |
CACHE=EXPIRES=<period> | sets the expiry period for the entry |
CACHE=GUARD=<period> | sets the guard period (no reload) for the cache entry |
CACHE=MAX=<integer> | cache files up to this many kilobytes (overrides [CacheFileKBytesMax]) |
CACHE=[NO]CGI | cache CGI-compliant (script) responses |
CACHE=[NO]FILE | cache files matching this rule (the default) |
CACHE=[NO]NET | cache any network output |
CACHE=[NO]NPH | cache NPH (non-parse-header script) responses |
CACHE=[NO]SCRIPT | cache both CGI and NPH responses |
CACHE=[NO]SSI | cache SSI document responses |
CACHE=[NO]QUERY | cache (script) regardless of containing a query string |
CACHE=[NO]PERM | permanently cache these files |
Rule | Description |
---|---|
CGIPLUSIN=CC=NONE | no carriage control |
CGIPLUSIN=CC=LF | each record has a trailing line feed (0x0a) |
CGIPLUSIN=CC=CR | a trailing carriage return (0x0d) |
CGIPLUSIN=CC=CRLF | a trailing carriage return then line feed (0x0d0a) |
CGIPLUSIN=[NO]EOF | the end of the record stream is indicated using an end-of-file |
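The carriage-control keywords in the table above correspond to the following byte sequences, shown here as a small Python sketch. The dictionary and function names are hypothetical. The sketch only illustrates what each keyword appends to a record.

```python
# Keyword to trailing bytes, following the hexadecimal values in the table.
CARRIAGE_CONTROL = {
    "NONE": b"",
    "LF": b"\x0a",
    "CR": b"\x0d",
    "CRLF": b"\x0d\x0a",
}


def terminate_record(record, cc):
    """Append the carriage control selected by the keyword to a record."""
    return record + CARRIAGE_CONTROL[cc.upper()]


print(terminate_record(b"REQUEST_METHOD=GET", "crlf"))
```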
Rule | Description |
---|---|
CLIENT=FORWARDED | Substitute the (first) address from the "Forwarded:" request header. Return a 403 status if no "Forwarded:" header present. |
CLIENT=IF=FORWARDED | As above but the absence of a "Forwarded:" request header is not fatal. |
CLIENT=LITERAL=<string> | Substitute the following string. Intended for testing purposes. |
CLIENT=RESET | Reset the substituted client data to the original (up-stream proxy). |
CLIENT=XFORWARDEDFOR | Substitute the (first) address from the "X-Forwarded-For:" request header. Return a 403 status if no "X-Forwarded-For:" header present. |
CLIENT=IF=XFORWARDEDFOR | As above but the absence of an "X-Forwarded-For:" request header is not fatal. |
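The difference between CLIENT=XFORWARDEDFOR and CLIENT=IF=XFORWARDEDFOR can be sketched in Python as follows. The function name, the plain dictionary of request headers (with an exact-case key) and the returned tuple are assumptions made for illustration, not the server's internal interface.

```python
def client_from_x_forwarded_for(headers, required=True):
    """Return (status, address). The first address in the header is
    substituted. If the header is absent, a 403 status is returned
    when required, otherwise the client data is left unchanged
    (signalled here by a None address and None status)."""
    value = headers.get("X-Forwarded-For", "")
    first = value.split(",")[0].strip()    # first of a comma-separated list
    if first:
        return None, first
    return (403, None) if required else (None, None)


print(client_from_x_forwarded_for({"X-Forwarded-For": "192.0.2.10, 198.51.100.7"}))
print(client_from_x_forwarded_for({}, required=False))
```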
Rule | Description |
---|---|
DIR=[NO]ACCESS | allows directory listing |
DIR=ACCESS=SELECTIVE | allows directory listing if the directory contains the file .WWW_BROWSABLE |
DIR=DELIMIT=<keyword> | header, footer, both, none |
DIR=[NO]ILINK | icon plain-text link can be disabled |
DIR=[NO]IMPLIEDWILDCARD | add wildcards if not in path |
DIR=SORT=<column> | pre-sort a listing |
DIR=STYLE=<keyword> | set the style of a directory listing |
DIR=TARGET=<string> | open the file in another window |
DIR=THESE=<filespec> | restrict listing to specified filename(s) |
DIR=TITLE=<keyword> | format the title of the window (tab) |
DIR=VERSIONS=<integer> | list the specified maximum number of file versions, or if an asterisk all versions |
DIR=[NO]WILDCARD | allow a directory listing to be "forced" by including wildcards in the path |
Rule | Description |
---|---|
HTML=BODYTAG= | specifies the page <BODY> tag characteristics (e.g. html=bodytag="BGCOLOR=#ffffff") |
HTML=HEADER= | the page header text |
HTML=HEADERTAG= | the <TD> tag characteristics of the header table (e.g. html=headertag="BGCOLOR=#cccccc") |
HTML=FOOTER= | the page footer text |
HTML=FOOTERTAG= | the <TD> tag characteristics of the footer table |
The headertag and footertag directives also allow the full table tag to be specified, allowing greater flexibility with these parts of the page (e.g. html=footertag="<TABLE BORDER=1 CELLPADDING=10 CELLSPACING=0><TR><TD BGCOLOR=#cccccc>").
Rule | Description |
---|---|
HTTP=ACCEPT-CHARSET=<string> | the "Accept-Charset:" field |
HTTP=ACCEPT-LANGUAGE=<string> | the "Accept-Language:" field |
Rule | Description |
---|---|
HTTP2=PROTOCOL=1.1 | send the client an HTTP_1_1_REQUIRED error which should cause it to re-request as HTTP/1.1 |
HTTP2=SEND=GOAWAY[=<integer>] | send the client a connection GOAWAY frame with optional error number |
HTTP2=SEND=PING | send the client an HTTP/2 ping |
HTTP2=SEND=RESET[=<integer>] | send the client a stream (request) reset (close) with optional error number |
HTTP2=WRITE=LOW|NORMAL|HIGH | this stream (request) will write to the network at the specified priority relative to other data on the connection |
Rule | Description |
---|---|
[NO]MAP=ELLIPSIS | By default the use of the VMS file specification ellipsis wildcard ("...") is not allowed. This setting enables it for the path specified. Use with caution. |
[NO]MAP=ONCE | Normally, when a script has been identified during mapping, the resultant path information is also mapped in a second pass. This can be suppressed by SETing the path as MAP=ONCE. The resultant path is then given to the script without further processing. |
MAP=RESTART | Causes an immediate change to the order of rule processing. Instead of the next rule, the first rule in the configuration is processed. This is intended to remove the need for copious repetition in the rule set. A common processing block, or set of such blocks, can be established near the start of the rule set and be given requests from processing points further down in the rules. It is intended to be used only once or perhaps twice and will abort the request if it occurs too often. Can be detected using the restart: conditional (5.3 Conditional Keywords). Use with caution! Injudicious use would make unexpected mappings expected! |
[NO]MAP=ROOT=<string> | Prefixes the results of following rules with the specified path so that they are all subordinate to it. This also populates the DOCUMENT_ROOT CGI variable. See ‘Document Root’ in 2.2 Site Organisation. |
[NO]MAP=SET=IGNORE | All path SETings following an IGNORE are completely ignored (not applied to the mapping or request characteristics) until a subsequent NOIGNORE is encountered. |
[NO]MAP=SET=REQUEST | All path SETings following a NOMAP=SET=REQUEST are only applied to the mapping and not to the request's characteristics until a subsequent MAP=SET=REQUEST is encountered. Intended for use during callouts. These can be detected using the callout: conditional (5.3 Conditional Keywords). |
[NO]MAP=URI | Normally mapping is performed on the request path. This SETing replaces the path with the full, raw, request URI (undecoded path plus any query string). This allows subsequent mapping rules to be applied to the full URI and therefore path components to be remapped into query components, and query components into path components (using specified substitution, see 4.4 Expression Substitution). |
Rule | Description |
---|---|
ODS=2 | is basically redundant, because if a path is not indicated as anything else it is assumed to be ODS-2. This can be used for clarity in the mapping rules if required. |
ODS=5 | is used to indicate that a particular path maps to files on an ODS-5 (EFS) volume and so the names may comply to extended specifications. This changes the way file names are processed, including for example the replacement of invalid RMS characters (see below). |
ODS=ADS | is used to process file names that are encoded using the Advanced Server (PATHWORKS 6) schema. |
ODS=NAME=8BIT|UTF8|DEFAULT | When a file is PUT (created) using WebDAV or upload, for non-7bit ASCII file names use native ODS-5 8bit syntax (default) or UTF-8 encoded character sequences. |
ODS=PWK | is used for processing file names encoded using the PATHWORKS 4/5 schema. |
ODS=SMB | is a synonym for ODS=ADS and makes clear the path is also being served by Samba. |
ODS=SRI | for file names encoded using the SRI schema (used by MultiNet and TCPware NFS, FTP and other utilities). |
Rule | Description |
---|---|
PROXY=[NO]AFFINITY | sets client to origin server affinity. |
PROXY=BIND=<ip-address> | makes outgoing proxy requests appear to originate from this IP address. Must be an address that the media can be bound to. |
PROXY=CHAIN=<host:port> | makes outgoing proxy requests chain to this up-stream proxy server. |
PROXY=CHAIN=CRED=<username:password> | provides proxy authentication credentials to an up-stream proxy server. |
PROXY=FORWARDED | controls generation of a proxy "Forwarded:" request field. This optional field contains information on the proxy server and as a further option the client name or IP address. |
PROXY=HEADER=<name>[=<string>] | removes or sets the value of the specified proxied request header. Examples: |
PROXY=REVERSE=[NO]AUTH | suppresses propagation of any "Authorize" header. |
PROXY=REVERSE=LOCATION=<string> | rewrites the matching "Location:" header field URL of a 302 response from an internal, reverse-proxied server. |
PROXY=REVERSE=[NO]VERIFY | sets a specialized authorization capability. See WASD_ROOT:[SRC.HTTPD]PROXYVERIFY.C for further information. |
PROXY=TUNNEL=REQUEST=<string> | allows the originating end of a WASD tunnel to specify an HTTP request line or even request header to be provided to the tunnel target end when the connection is established. |
PROXY=UNKNOWN | causes the server to propagate all request fields provided by the client to the proxied server (by default WASD only propagates those it recognises). |
PROXY=XFORWARDEDFOR=<keyword> | controls generation of a proxy "X-Forwarded-For:" request field. This optional field (a de facto standard originally from the Squid caching package) contains the name or IP address of the proxied client. |
Rule | Description |
---|---|
PUT=MAX=<integer>|* | Maximum number of kilobytes allowed for a request body, if "*" then effectively unlimited (per-path equivalent of the global directive [PutMaxKbytes]). |
PUT=RFM=FIX512|STM|STMCR|STMLF|UDF | When a request body is uploaded into the file-system and the content-type is not text this determines the file record format. The precedence for determining the created file record format is [AddType] RFM:, then any per-path PUT=RFM= mapping rule, then [PutBinaryRFM], then the default of UDF. |
Rule | Description |
---|---|
REPORT=BASIC | include less detail in error message |
REPORT=DETAILED | includes more detail |
REPORT=TUNNEL | brief, non-HTML error messages suitable for proxy tunnel |
REPORT=4<nn>=<nnn> | maps one 400 class HTTP status to another (to conceal the true origins of some error messages) |
Rule | Description |
---|---|
RESPONSE=CSP=<parameter> RESPONSE=CSPRO=<parameter> | see 3.10 Content Security Policy (CSP) |
RESPONSE=GZIP=<keyword> | controls generation of GZIPed response bodies (2.4 GZIP Encoding) |
RESPONSE=HEADER=<parameter> | changes the way in which a response header is generated by the server. |
RESPONSE=VAR=<parameter> | where a response is being provided from a variable-length record file each record should be terminated as follows. |
For scripting detail see the WASD Scripting Environment document.
Rule | Description |
---|---|
SCRIPT=AS=<parameter> | for non-server account scripting this rule allows the user account to be either explicitly specified or substituted through the use of the tilde character "~" or the dollar "$". |
SCRIPT=BIT-BUCKET=<hh:mm:ss> | specifies the period for which a script continues to execute if the client disconnects. Overrides the WASD_CONFIG_GLOBAL [DclBitBucketTimeout] configuration directive. |
[NO]SCRIPT=BODY=DECODE | instructs the server to decode (un-chunk and/or un-GZIP) an encoded request body before transferring it to the script. The script must be aware of this and change its processing accordingly. See 2.4 GZIP Encoding. |
SCRIPT=CONTROL=<string> | Supply the specified string to the CGI processor as if a script had provided it using a "Script-Control:" response header field. |
SCRIPT=COMMAND=<string> | allows additional parameters and qualifiers to be passed to the script
activation command line. First parameter must be an asterisk to use the server
resolved script command. If the first parameter is not an asterisk it
substitutes for the script activation verb. Subsequent parameters must be as
they would be used on the command line. The following setting
set /cgi-bin/example* script=command="* /ONE /TWO=THREE FOUR"
would result in the hypothetical script being command-line activated
$ EXAMPLE /ONE /TWO=THREE FOUR
|
SCRIPT=CPU=<hh:mm:ss> | specifies that the server should not allow the script to use more than the specified quantity of CPU time. This is approximate, due to the way the server administers scripting. It can serve to prevent scripts from consuming indefinite quantities of system resources. |
SCRIPT=DEFAULT=<string> | sets the default directory for the script environment (a SET DEFAULT immediately prior to script activation). This can be suppressed (for backward compatibility purposes) using a "#" as the target directory. This string is reflected in CGI variable SCRIPT_DEFAULT so that CGIplus script and RTE engines can be informed of this setting for a particular script's environment. Unix syntax paths may also be specified. If the default begins with a "/" character the SET DEFAULT is not performed but the SCRIPT_DEFAULT variable is set appropriately allowing the equivalent of a chdir() to be performed by the scripting environment. |
[NO]SCRIPT=FIND | by default the server always confirms the existence and accessibility of a script file by searching for it before attempting to activate it. If it does not exist it reports an error. A Run-Time Environment (RTE) may need to access its own script file via a mechanism available only to itself. The server script search may be disabled by SETing the path as nofind, for example "script=nofind". The script path and filename are then passed directly to the RTE for it to process and activate. |
SCRIPT=LIFETIME=<hh:mm:ss> | provides a per-path (and hence per-script) value for a script process zombie (idle scripting process) or idle CGIplus and RTE process lifetime. This per-path SETing overrides the respective [DclZombieLifeTime] and [DclCGIplusLifeTime] global directives. |
SCRIPT=PARAM=<name=value> | allows non-CGI environment variables to be associated with a particular script
path. The name component becomes a variable containing the specified value
passed to the script. Multiple, comma-separated name=value pairs may be
specified. The value may be quoted. The following path setting
set /cgi-bin/example*
script=params=(first=one,second="Two (and Three)")
would result in additional CGI variables available to the script
WWW_FIRST == "one"
WWW_SECOND == "Two (and Three)"
Multiple script=params set against the one request override previous settings unless the parameters are specified with a leading plus symbol, as in set /cgi-bin/example* script=params=+(third=three,fourth="number 4")
|
[NO]SCRIPT=PATH=FIND | directs the server to check for and report if the file specified in the path does not exist before activating the script process. Normally this would be left up to the script. |
[NO]SCRIPT=QUERY=NONE | saves a small amount of overhead by suppressing the decomposition of any query string into key or form fields for those environments that do this for themselves. |
[NO]SCRIPT=QUERY=RELAXED | normally when the CGI variables are being prepared for a script and the query string is parsed an error is reported if it uses x-www-form-urlencoded format and the encoding contains an error. However some scripts use non-strict encodings and this rule allows those scripts to receive the query strings without the server complaining first. |
[NO]SCRIPT=SYNTAX=UNIX | provides the SCRIPT_FILENAME and PATH_TRANSLATED CGI variables in Unix file-system syntax rather than VMS file-system syntax (i.e. /DEVICE/dir1/dir2/file.type rather than DEVICE:[DIR1.DIR2]FILE.TYPE). |
[NO]SCRIPT=SYMBOL=TRUNCATE | allows otherwise aborted script processing to continue. Script CGI variables are provided using DCL symbols. With VMS V7.3-2 and later symbol capacity is in excess of 8000 characters. For VMS V7.3-1 and earlier it has a limit of around 1000 characters. If a symbol is too large the server by default aborts the request generating a 500 HTTP status. If the above mapping is made (against the script path) excessive symbol values are truncated and such symbol names placed into a special CGI variable named SERVER_TRUNCATE. |
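The script parameter setting described in the table above, including the leading plus symbol that adds to rather than overrides earlier settings, can be illustrated with a small Python sketch. The parser is a simplification written for this illustration (double-quoted values only, no escape handling) and the function names are hypothetical. It is not the server's own parsing code.

```python
def parse_script_params(setting):
    """Parse '(name=value,name="quoted value")', optionally prefixed
    with '+', into (append, {"WWW_NAME": value})."""
    text = setting.strip()
    append = text.startswith("+")
    if append:
        text = text[1:]
    if text.startswith("(") and text.endswith(")"):
        text = text[1:-1]
    variables = {}
    i, n = 0, len(text)
    while i < n:
        eq = text.index("=", i)
        name = text[i:eq].strip()
        i = eq + 1
        if i < n and text[i] == '"':       # quoted value may contain commas
            end = text.index('"', i + 1)
            value = text[i + 1:end]
            i = end + 1
        else:                              # unquoted value ends at a comma
            end = text.find(",", i)
            if end == -1:
                end = n
            value = text[i:end]
            i = end
        variables["WWW_" + name.upper()] = value
        if i < n and text[i] == ",":
            i += 1
    return append, variables


def apply_script_params(previous, setting):
    """Later settings override earlier ones unless prefixed with '+'."""
    append, variables = parse_script_params(setting)
    return {**previous, **variables} if append else variables


first = apply_script_params({}, '(first=one,second="Two (and Three)")')
both = apply_script_params(first, '+(third=three,fourth="number 4")')
print(both)
```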
Rule | Description |
---|---|
[NO]SSI=PRIV | SSI documents cannot contain privileged directives (e.g. <!--#exec ... -->) unless owned by SYSTEM ([1,4]) or in a path set as allowing these directives. Use SSI=priv to enable this, NOSSI=priv to disable. Caution: these SSI directives are quite powerful, use great care when allowing any particular document author or authors to use them. |
SSI=EXEC=<string> | where <string> is a comma-separated list of the #dcl parameters
permitted for the path allows fine-grained control of what capabilities are
enabled. The parameter "#" enables SSI on a per-path basis.
ssi=exec=say,show
ssi=exec=#
|
When enabling these variables it is advised to increase the WASD_CONFIG_GLOBAL [BufferSizeDclCommand] and [BufferSizeCgiPlusIn] directives by approximately 2048.
Rule | Description |
---|---|
NOSSLCGI | disables the facility |
SSLCGI=none | disables the facility |
SSLCGI=Apache_mod_SSL | provides Apache mod_ssl style variables |
SSLCGI=Apache_mod_SSL_extens | provides variables representing X509 V3 extensions from the server certificate |
SSLCGI=Apache_mod_SSL_client | provides variables representing X509 V3 extensions from the client certificate |
SSLCGI=Purveyor | provides Purveyor style variables |
Rule | |
---|---|
THROTTLE=n[/u][,n,n,n,hh:mm:ss,hh:mm:ss] | |
THROTTLE=FROM=<n> | |
THROTTLE=USER=<u> | |
THROTTLE=TO=<n> | |
THROTTLE=RESUME=<n> | |
THROTTLE=BUSY=<n> | |
THROTTLE=TIMEOUT=QUEUE=<hh:mm:ss> | |
THROTTLE=TIMEOUT=BUSY=<hh:mm:ss> |
These parallel the respective configuration timeout periods. See 6.2 Alphabetic Listing.
Rule | Description |
---|---|
TIMEOUT=<hh:mm:ss>, <hh:mm:ss>,<hh:mm:ss> | Keep-alive, then no-progress, then output timeouts. |
TIMEOUT=KEEPALIVE= <hh:mm:ss> | Keep idle network connections alive for this long. |
TIMEOUT=NOPROGRESS= <hh:mm:ss> | Terminate connection when no data is transferred to the client for this period. |
TIMEOUT=OUTPUT= <hh:mm:ss> | Terminate connection after this period when no response data has been sent. |
NOTIMEOUT | No timeouts are applied to the request. |
Rule | Description |
---|---|
WEBDAV=[NO]HIDDEN | list (default) or hide U*x hidden files (i.e. those with names beginning with period) |
WEBDAV=[NO]LOCK | allow/apply WebDAV locking to this path |
WEBDAV=[NO]PROFILE | WebDAV access according to SYSUAF profile |
WEBDAV=[NO]PROP | allow/apply WebDAV 'dead' property(ies) to this path |
WEBDAV=[NO]PUT=LOCK | a resource must be locked before a PUT is allowed |
WEBDAV=[NO]READ | WebDAV methods allowed to read this tree |
WEBDAV=[NO]SERVER | WebDAV access as server account (best effort) |
WEBDAV=[NO]WINPROP | when NOWINPROP Windows properties are ignored and emulated |
WEBDAV=[NO]WRITE | WebDAV methods allowed to write to this path (implied read) |
WEBDAV=LOCK=TIMEOUT=DEFAULT= | hh:mm:ss |
WEBDAV=LOCK=TIMEOUT=MAX= | hh:mm:ss |
WEBDAV=META=DIR= | per-path equivalent of global [WebDAVmetaDir] |
Rule | Description |
---|---|
WEBSOCKET=INPUT=integer | Specifies the size of the WEBSOCKET_INPUT mailbox buffer, in bytes. |
WEBSOCKET=OUTPUT=integer | Specifies the size of the WEBSOCKET_OUTPUT mailbox buffer, in bytes. |
Of course, as with all mapping rules, paths containing file types (extensions) may be specified so it is quite easy to apply settings to particular groups of files. Multiple settings may be made against the one path, merely separate set directives from each other with white-space. If a setting string is required to contain white-space enclose the string with single or double quotes, or curly brackets. The following example gives a small selection of potential uses.
Path SETings may be appended to any rule that contains both a template and result. This makes it possible to apply path SETings using matching final rules. For example a matching PASS rule does not require a separate, preceding SET rule containing the same path to also apply required SETings. This is more efficient (requiring less pattern matching) and tends to make the rule set less cluttered.
Path mapping is required to get from web-space into file-space, and that mapping is not necessarily one-to-one. That is, /web/doc/ may not be WEB:[DOC] but for example, DKA0:[WEB.DOC] so that mapping would be
Mapping paths in reverse is needed to get something like DKA0:[WEB.DOC]THIS.TXT (that may come from a $SEARCH result) back into the web-space of /web/doc/this.txt. So WASD needs paths that may be mapped using the result back to the template. In simple mappings the one rule can serve both purposes. In some situations explicit, extra rules are needed.
The above example is trivial. If WASD needs to turn something like DKA0:[WEB.DOC]THIS.TXT into a web-space representation (URI), it converts the file-space specification into URI syntax (i.e. /dka0/web/doc/this.txt) and then scans the rules, comparing that to the result strings in the MAP rules. When one matches, the template component is used to generate a web-space representation, the reverse of what was done when the request was initially being processed.
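The reverse-mapping procedure just described can be sketched in Python. This is a simplified illustration assuming rules with a single trailing wildcard, supplied as (template, result) pairs. The function names and the rule pair used are hypothetical, and the server's actual matching is more capable than a prefix comparison.

```python
import re


def vms_to_url(spec):
    """Convert DEVICE:[DIR1.DIR2]FILE.TYPE into /device/dir1/dir2/file.type."""
    match = re.fullmatch(r"([^:]+):\[([^\]]+)\](.*)", spec)
    if match is None:
        raise ValueError("not a DEVICE:[DIRECTORY]FILE specification")
    device, directories, name = match.groups()
    parts = [device] + directories.split(".") + [name]
    return "/" + "/".join(part.lower() for part in parts)


def reverse_map(url_spec, rules):
    """Compare url_spec against each rule's result and, for the first
    one that matches, rebuild the path from that rule's template."""
    for template, result in rules:
        prefix = result[:-1]               # strip the trailing wildcard
        if url_spec.startswith(prefix):
            return template[:-1] + url_spec[len(prefix):]
    return None


rules = [("/web/doc/*", "/dka0/web/doc/*")]    # hypothetical forward rule
print(reverse_map(vms_to_url("DKA0:[WEB.DOC]THIS.TXT"), rules))
```

Because the scan stops at the first matching result, the order of the rules decides which web-space view a file specification is reported under.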
The non-trivial example is often associated with concealed, search-list devices. For example, the somewhat contrived
In such a case there is a need to add explicit reverse-mapping rules (often immediately following the forward mapping rule for convenience of grouping, but rules are also a little position sensitive so some skill is required) for the purpose of getting the underlying file specifications into a form for web consumption. In the above scenario an example would be
It is not always straightforward and sometimes a decision is necessary about how the web-space is to be presented to the clients. For instance, while you can easily have multiple web-space views of the one file-space area, it is less straightforward to have multiple web-space reverse mappings of the one file-space (as normally only the first matching rule will ever be reverse-mapped).
The example mapping rule file for the WASD HTTP server can be viewed.
The result string of these rules may or may not correspond to a VMS physical file system path. Either way the resulting rule is further processed before passing or failing.
As described in 2.3 Virtual Services, virtual service syntax may be used with mapping rules to selectively apply rules to one specific service. This example provides the essentials of using this syntax. Note that service-specific and service-common rules may be mixed in any order allowing common mappings (e.g. for scripting) to be shared.
The Server Administration page WATCH report provides the capability to view the rule database as well as rule mapping during actual request processing, using the WATCH facility.
As this has been deprecated for some years now the documentation for this functionality has been removed.
For backward-reference see the "WASD Hypertext Services - Technical Overview" document for release v9.3 or earlier.
The convention for specifying user web areas is "/~username/". The basic idea is that the user's web-available file-space is mapped into the request in place of the tilde and username.
The USER rule maps a VMS user account default device and directory (i.e. home directory) into a request path (10.5.3 USER Rule). That is, the base location for the request is obtained from the VMS system's SYSUAF file. A user's home directory information is cached, to reduce load on the authorization databases. As this information is usually quite static there is no timeout period on such information (although it may be flushed to make room for other users). Cache contents are included in the Mapping Rules Report and are implicitly flushed when the server's rules are reloaded.
The following is a typical usage of the rule.
Note the "/www" subdirectory component. It is strongly recommended that users never be mapped into their top-level, but into a web-specific subdirectory. This effectively "sandboxes" Web access to that subdirectory hierarchy, allowing the user privacy elsewhere in the home area.
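The substitution described above, where the user's web-available file-space replaces the tilde and username, can be sketched in Python. This is an illustration only, not WASD rule syntax or server code: the HOME_DIRS table stands in for the SYSUAF lookup, and the URL-style home paths, usernames and function name are all hypothetical.

```python
from typing import Optional

# Hypothetical stand-in for the SYSUAF lookup: username -> home area,
# written as a URL-style path purely for illustration.
HOME_DIRS = {
    "alice": "/users/alice",
    "bob": "/users2/bob",
}

# Web-specific subdirectory, so the rest of the home area stays private.
WEB_SUBDIR = "www"


def map_user_path(request_path: str) -> Optional[str]:
    """Replace a leading /~username/ with that user's web subdirectory."""
    if not request_path.startswith("/~"):
        return None  # not a user-area request
    username, slash, rest = request_path[2:].partition("/")
    if not username or not slash:
        return None  # no username, or no delimiter after the username
    home = HOME_DIRS.get(username.lower())
    if home is None:
        return None  # unknown account
    return f"{home}/{WEB_SUBDIR}/{rest}"


print(map_user_path("/~alice/index.html"))
```

Because every mapped path begins with the user's web subdirectory, nothing directly under the top-level home area is reachable through this mapping. The sketch does not attempt path normalisation or any other checks a real server performs.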
To accommodate request user paths that do not incorporate a trailing delimiter after the username the following redirect may be used to cause the browser to re-request with a more appropriate path (make sure it follows the USER rule).
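The intent of that redirect can be sketched in Python as follows. This is not the WASD redirect rule itself; the function name and the regular expression are assumptions made for illustration. It is meant to be consulted only for paths that the user mapping did not match, mirroring the requirement that the redirect follow the USER rule.

```python
import re
from typing import Optional

# Matches "/~username" with nothing after the username (no trailing "/").
_BARE_USER_PATH = re.compile(r"^/~([^/]+)$")


def trailing_slash_redirect(request_path: str) -> Optional[str]:
    """Return the path the browser should re-request, or None if no redirect applies."""
    match = _BARE_USER_PATH.match(request_path)
    if match is None:
        return None
    return f"/~{match.group(1)}/"


print(trailing_slash_redirect("/~alice"))
```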
WASD also "reverse maps" VMS specifications into paths and so requires additional rules to provide these mappings. (Reverse mapping is required during directory listings and error reporting.) For the continuing example the following rules would be required (and in the stated order).
Where user home directories are spread over multiple devices (physical or concealed logical) a reverse-mapping rule would be required for each. Consider the following situation, where user directories are distributed across these devices (concealed logicals)
This would require the following mapping rules (in the stated order).
Accounts with a search list as a default device (e.g. SYS$SYSROOT) present particular complications in this schema and should be avoided.
Of course vanilla mapping rules may be used to provide for special cases. For instance, if there is a requirement for a particular, privileged account to have a user mapping, that could be provided as in the following (rather exaggerated) example.
In some situations it may be desirable to allow the average Web user to experiment with or implement scripts. With WASD 7.1 and later, and VMS V6.2 and later, this is possible. Detached scripting must be enabled, the /PERSONA startup qualifier used, and appropriate mapping rules in place. If the SET "script=as=" mapping rule specifies a tilde character then for a user request the mapped SYSUAF username is substituted.
The following example shows the essentials of setting up a user environment where access is to a subdirectory in the user's home directory, [.WWW], with scripts located in a subdirectory of that, [.WWW.CGI-BIN].
For more detailed information see the "Scripting Overview, Introduction".
As this has been deprecated for some years now the documentation for this functionality has been removed.
For backward-reference see the "WASD Hypertext Services - Technical Overview" document for release v9.3 or earlier.
Cross-site HTTP requests are HTTP requests for resources from a domain different to the domain of the resource making the request. For instance, a resource loaded from domain one (http://domain.example) such as an HTML web page, makes a request for a resource on domain two (http://domain.foo), such as an image, using the img element (http://domain.foo/image.jpg). This occurs very commonly on the web today. Pages load a number of resources in a cross-site manner, including CSS stylesheets, images and scripts, and other resources.
Cross-site HTTP requests initiated from within browser-based applications have been subject to well-known restrictions, for well-understood security reasons. In particular, this meant that an actively processing web application could only make HTTP requests to the domain it was loaded from, and not to other domains. Developers expressed the desire to safely evolve capabilities to make cross-site requests, for better, safer web applications. The Web Applications Working Group within the W3C has recommended the new Cross-Origin Resource Sharing (CORS) mechanism, which provides a way for web servers to support cross-site access controls, which enable secure cross-site data transfers.
This section is not a CORS reference, just the WASD implementation. Readers are referred to more authoritative CORS resources.
WASD supports CORS using mapping rules. This means cross-origin requests are evaluated prior to accessing any resources or activating any scripts, etc. If the request has an "Origin: .." header and the path has been set cors=origin=.. the server performs preflighted and request checks. If the request is CORS authorised the server adds CORS response headers. If it is not CORS authorised the server adds nothing. Some significant understanding of the purpose and operation of CORS is required to tailor the provision of the required response headers.
Rule | Description |
---|---|
CORS=AGE=integer seconds | Access-Control-Max-Age: response header |
CORS=CRED=true|false | Access-Control-Allow-Credentials: response header |
CORS=EXPOSE=header[,header2,header3] | Access-Control-Expose-Headers: response header |
CORS=HEADERS= | Access-Control-Allow-Headers: response header |
CORS=METHODS=method[,method2,method3] | Access-Control-Allow-Methods: response header |
CORS=ORIGIN=URL | Access-Control-Allow-Origin: response header |
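The relationship between the settings in the table and the response headers can be sketched in Python. This follows general CORS behaviour rather than WASD source code: the CorsSettings class, the exact-match origin comparison, the preflight test and the choice of which headers accompany preflight versus ordinary responses are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CorsSettings:
    """Hypothetical per-path settings, named after the rules in the table."""
    origin: str                     # CORS=ORIGIN
    methods: Optional[str] = None   # CORS=METHODS
    headers: Optional[str] = None   # CORS=HEADERS
    expose: Optional[str] = None    # CORS=EXPOSE
    cred: Optional[bool] = None     # CORS=CRED
    age: Optional[int] = None       # CORS=AGE (seconds)


def cors_response_headers(method: str,
                          request_headers: Dict[str, str],
                          settings: Optional[CorsSettings]) -> Dict[str, str]:
    """Return the CORS response headers to add, or {} if not CORS authorised."""
    origin = request_headers.get("Origin")
    if origin is None or settings is None:
        return {}  # not a cross-origin request, or path has no CORS settings
    if origin != settings.origin:
        return {}  # assumption: simple exact comparison of the origin
    out = {"Access-Control-Allow-Origin": origin}
    if settings.cred is not None:
        out["Access-Control-Allow-Credentials"] = "true" if settings.cred else "false"
    preflight = method == "OPTIONS" and "Access-Control-Request-Method" in request_headers
    if preflight:
        if settings.methods:
            out["Access-Control-Allow-Methods"] = settings.methods
        if settings.headers:
            out["Access-Control-Allow-Headers"] = settings.headers
        if settings.age is not None:
            out["Access-Control-Max-Age"] = str(settings.age)
    elif settings.expose:
        out["Access-Control-Expose-Headers"] = settings.expose
    return out


example = CorsSettings(origin="http://domain.example", methods="GET,POST", age=600)
print(cors_response_headers(
    "OPTIONS",
    {"Origin": "http://domain.example", "Access-Control-Request-Method": "POST"},
    example))
```

Header names are looked up with their exact spelling here for brevity; a real server compares header names case-insensitively.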
For a request containing
For a request containing
11.1 SYSUAF/Identifier Authentication
11.2 Other Authentication
11.3 Read and Write Groupings
11.4 Considerations
WASD offers a comprehensive and versatile authentication and authorization environment. A little too comprehensive, often leaving the new administrator wondering where to begin. The role of this chapter is to provide a starting place, especially for sources of authentication, along with some basic configurations. Authentication and Authorization of WASD Features and Facilities contains a detailed explanation of all aspects. All examples here assume a standard installation and environment.
Just to clarify. Authentication is the verification of a user's identity, usually through username/password credentials. Authorization is allowing a certain action to be applied to a particular path based on that identity.
Changes to the authorization configuration file can be validated at the command-line before reload or restart. This detects and reports any syntactical and configuration errors but of course cannot check the intent of the rules.
If additional server startup qualifiers are required to enable specific authorization features then these must also be provided when checking. For example:
A server's currently loaded authorization rules may also be interrogated from the Server Administration menu (see Server Administration of WASD Features and Facilities).
This setup allows any active account to authenticate using the local VMS username and password. By default not every account may authenticate this way, only those holding specified VMS rights identifiers. The examples provided in this section allow access to the WASD online Server Administration facility, and so may be followed specifically for that purpose, as well as serve as a general guide.
Of course the one account may hold multiple identifiers and so may have access to various areas.
Using VMS rights identifiers allows significant granularity in providing access.
If the WASD_CONFIG_AUTH configuration file is changed, or rights identifiers are granted or revoked from accounts, the server should be directed to reload the file and purge any cached authorization information.
Other sources of authentication are available, either by themselves or used in the same configuration file (different realms and paths) as those already discussed (Authentication Sources of WASD Features and Facilities). Non-SYSUAF sources do not require any startup qualifier to be enabled.
This is a very simple arrangement, with little inherent security. Lists are more useful when grouping names together for specifying which group may do what to where.
These databases may be administered using the online Server Administration facility (HTTPd Server Revise of WASD Features and Facilities) or the HTAdmin command-line utility (HTAdmin of WASD Features and Facilities), and are quite secure and versatile.
WASD allows separate sources for groups of usernames to control read and write access in a particular realm (Realm, Full-Access, Read-Only of WASD Features and Facilities).
These groups may be provided via simple lists, VMS identifiers, HTA databases and authorization agents. The following example shows an identifier authenticated realm with full and read-only access controlled by two simple lists. For the first path the world has no access, for the second read-only access (with the read-only grouping becoming basically redundant information).
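The effect of separate full-access and read-only groupings, combined with a world access setting, can be sketched in Python. This is not WASD authorization rule syntax; the function, the set-based group lists and the choice of GET and HEAD as the read methods are assumptions made for illustration.

```python
from typing import Optional, Set

# Assumption for this sketch: which HTTP methods count as "read" access.
READ_METHODS = {"GET", "HEAD"}


def is_allowed(username: Optional[str],
               method: str,
               full_access: Set[str],
               read_only: Set[str],
               world_read: bool = False) -> bool:
    """Decide access for an authenticated username (None means the world)."""
    is_read = method in READ_METHODS
    if username in full_access:
        return True                 # full-access group: read and write
    if username in read_only:
        return is_read              # read-only group: read methods only
    return world_read and is_read   # everyone else: world setting applies


full = {"alice"}
readers = {"bob"}

# First path: the world has no access.
print(is_allowed("bob", "GET", full, readers), is_allowed(None, "GET", full, readers))
# Second path: the world has read-only access, so the read-only group adds nothing.
print(is_allowed(None, "GET", full, readers, world_read=True))
```

With world read access enabled, a member of the read-only group and an anonymous client get the same result, which is why the read-only grouping becomes redundant for the second path.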
Multiple authentication sources (realms) may be configured in the one WASD_CONFIG_AUTH file.
Multiple paths may be mapped against a single authentication source.
Any path may be mapped only once (for any single virtual service).
Paths may have additional access restrictions placed on them, including client host name, username, etc. (Access Restriction Keywords of WASD Features and Facilities).
The configuration file is loaded and stored by the server at startup. If changed it must be reloaded to take effect. This can be done manually using
Authentication information is cached. Access subsequently removed or modified will not take effect until the entry expires, or is manually purged using
Failed attempts to authenticate against a particular source are limited. When this is exceeded access is always denied. If this has happened the cache must be manually purged before a user can successfully authenticate
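The caching and failure-limit behaviour described in the last two considerations can be sketched in Python. This is a conceptual model, not the server's implementation: the AuthCache class, the lifetime and limit values, the point at which the limit applies and the injected clock are all hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class AuthCache:
    """Caches successful authentications and counts failures per username."""

    def __init__(self,
                 verify: Callable[[str, str], bool],
                 lifetime: float = 300.0,
                 failure_limit: int = 3,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._verify = verify              # consults the authentication source
        self._lifetime = lifetime          # seconds an entry stays valid
        self._failure_limit = failure_limit
        self._clock = clock
        # username -> (password, expiry time); plain text only for brevity.
        self._entries: Dict[str, Tuple[str, float]] = {}
        self._failures: Dict[str, int] = {}

    def authenticate(self, username: str, password: str) -> bool:
        if self._failures.get(username, 0) >= self._failure_limit:
            return False  # limit reached: denied until the cache is purged
        entry = self._entries.get(username)
        if entry and entry[1] > self._clock() and entry[0] == password:
            return True   # answered from cache; the source is not consulted
        if self._verify(username, password):
            self._entries[username] = (password, self._clock() + self._lifetime)
            self._failures.pop(username, None)
            return True
        self._failures[username] = self._failures.get(username, 0) + 1
        return False

    def purge(self) -> None:
        """Discard cached entries and failure counts."""
        self._entries.clear()
        self._failures.clear()
```

In this model a change at the authentication source is not seen until the cached entry expires or purge() is called, and a username that has reached the failure limit stays locked out until a purge, matching the two behaviours described above.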
Licensed under the GNU Public License, Version 3;
This product includes software developed by the Apache Group for use in the Apache HTTP server project (http://www.apache.org/).
This package uses the Expat XML parsing toolkit.
This package uses essential algorithm and code from Flexible and Economical UTF-8 Decoder.
This package contains software made available by the Free Software Foundation under the GNU General Public License.
This package contains software provided with the OSU (DECthreads) HTTP server package, authored by David Jones:
This product can include software developed by the OpenSSL Project for use in the OpenSSL Toolkit (https://www.openssl.org/).
This package uses SHA-1 hash code.
This software contains code derived in part from RSA Data Security, Inc:
SortTable version 2
Stuart Langridge, http://www.kryogenix.org/code/browser/sorttable/
nghttp2 - HTTP/2 C Library
Tatsuhiro Tsujikawa, https://github.com/tatsuhiro-t
VSI OpenVMS is a registered trademark of VMS Software Inc.
OpenVMS, HP TCP/IP Services for OpenVMS, HP C, Alpha, Itanium and VAX are registered trademarks of Hewlett Packard Enterprise
MultiNet and TCPware are registered trademarks of Process Software Corporation