Request processing (WASD_CONFIG_MAP) and authorization (WASD_CONFIG_AUTH) rules may be conditionally applied depending on request, server or other characteristics. These include
As described in 2.3.1 [[virtual-server]] a [[host:port]] rule applies subsequent configuration depending on whether the request service matches the specified service. This makes it a fundamental element of conditional configuration.
Note that service conditionals impose a boundary on the scope of if..endif constructs. That is, an if..endif may not span a virtual service conditional. A conditional flow syntax error is reported if an if..endif construct is not properly closed before encountering a subsequent [[host:port]] rule.
Conditionals may be nested up to a maximum depth of eight and are not case sensitive. They generally match via string comparison, although some tests are performed as boolean operations and some by converting the conditional parameter to a number before comparison. IP address parameters will accept a network mask as well as a string pattern.
The basis of much conditional decision making is string pattern matching. Both wildcard and regular expression based pattern matching are available (see 4. String Matching). Wildcard matching in conditional tests is greedy. Regular expression matching, in common with usage throughout WASD, is differentiated from wildcard patterns using a leading "^" character.
Conditional expressions and processing flow structures may be used in the following formats. Conditional and rule text may be indented for clarifying structure.
Logical operators are also supported, in conjunction with precedence ordering parentheses, allowing moderately complex compound expressions to be applied in conditionals.
Operator | Description |
---|---|
! | logical negation |
&& | logical AND |
|| | logical OR |
There are two more conditional structures that allow previous decisions to be reused. These are unif and ifif. The first unconditionally includes rules regardless of the current state of execution. The second resumes execution only if the previous if or elif expression was true. The else statement may also be used after an unif to continue only if the previous expression was false. The purpose of these constructs is to allow a single decision statement to include both conditional and unconditional rules.
Although the server cannot determine the correct intent of an otherwise syntactically correct conditional, if it encounters an unexpected but detectable condition during processing it aborts the request, supplying an appropriate error message.
Flow control errors (e.g. an if not closed by a subsequent endif) abort all rule processing and provide a fatal error report to the client.
The following keywords provide a match between the corresponding request or other value and a string immediately following the delimiting colon. White space or other reserved characters may not be included unless preceded by a backslash. The actual value being used in the conditional matching may be observed using the mapping item of the WATCH facility.
Keyword | Description |
---|---|
accept: | Browser-accepted content types as listed in the "Accept:" request header field. Same string as provided in CGI variable HTTP_ACCEPT. |
accept-charset: | Browser-accepted character sets as listed in the "Accept-Charset:" request header field. CGI variable HTTP_ACCEPT_CHARSET. |
accept-encoding: | Browser-accepted content encoding as listed in the "Accept-Encoding:" request header field. CGI variable HTTP_ACCEPT_ENCODING. |
accept-language: | Browser language preferences as listed in the "Accept-Language:" request header field. CGI variable HTTP_ACCEPT_LANGUAGE. |
authorization: | The raw authorization string from the request header, if any supplied. This could be simply used to test whether it has been supplied or not. |
callout: | Simple boolean value. If a script callout is in progress (see "Scripting Overview, CGI Callouts") it is true, otherwise false. |
client_connect_gt: | An integer representing the current network connections (those currently being processed plus those currently being "kept alive") for the particular client represented by the current request. If greater than this value returns true, otherwise false. See 2.6 Client Concurrency. |
cluster_member: | If the supplied node name is (perhaps currently) a member of the cluster (if any) the server may be executing on. |
command_line: | The command line qualifiers and parameters used when the server image was activated. |
cookie: | Raw cookie data as the text string provided in "Cookie:" request header field. CGI variable HTTP_COOKIE. |
decnet: | Whether DECnet is active on the system and which version is available. This value will be 0 if not active, 4 if PhaseIV or 5 if PhaseV. |
dict: | Matches the specified dictionary entry. See 5.5.4 WATCH Dictionary. |
directory: | Tests whether the specified directory exists or not. Parameter can be a URI available for mapping by the server or a VMS file-system specification. If no parameter is supplied the request path is mapped to a file-system specification. As this conditional accesses the file-system it can be relatively expensive in terms of server latency. |
document_root: | The DOCUMENT_ROOT CGI variable SET using the map=root=<string> mapping rule. |
file: | Tests whether the specified file exists or not. Parameter can be a URI available for mapping by the server or a VMS file-system specification. If no parameter is supplied the request path is mapped to a file-system specification. The specification can be a directory. As this conditional accesses the file-system it can be relatively expensive in terms of server latency. |
forwarded: | Proxy/gateway host(s) request forwarded by, as specified in request header field "Forwarded:". CGI variable HTTP_FORWARDED. |
host: | The host (and optionally port) specified in request header "Host:" field. This is used by all modern browsers to provide virtual host information to the server. CGI variable HTTP_HOST. |
http2: | Simple boolean value. Is true if the request is being transported using HTTP/2. |
instance: | Used to check whether a particular, clustered instance of WASD is available. See 5.3.4 Instance: and Robin: Keywords. |
jpi_username: | The account username the server is executing as. |
mapped_path: | The path resulting from mapping (phase 2 if script path involved) from which the path-translated is derived. |
multihome: | Somewhat specialised conditional that becomes non-null when a client used a different IP address to connect to the service than the one the service is bound to. It is set to the IP address the client used and may be matched using wildcard matching or as a network mask. |
note: | Ad hoc information (string) provided by the server administrator using the /DO=NOTE= facility (and online equivalent) that can be used to quickly and easily modify rule processing on a per-system or per-cluster basis. |
notepad: | Information (strings) stored using the SET notepad= mapping rule. See 5.3.1 Notepad: Keyword. |
ods: | Specified as 2 or 5 (Extended File System), or as SRI file name encoding (MultiNet NFS and others), PWK encoding (PATHWORKS 4/5), ADS encoding (Advanced Server / PATHWORKS 6), or SMB encoding (Samba - same as ADS). |
pass: | A numeric value, 1 or 2, representing the first or second pass (if a script component was parsed) through the path mapping rules. Will be zero at other times. When the server is reverse-mapping a file specification will be -1. |
path-info: | Path specified in the request line. CGI variable PATH_INFO. |
path-translated: | VMS translation of path-info. Available after rule mapping (i.e. during authorization rule processing). |
proctor: | Simple boolean value. If a proctored script this is true (see Script Proctor in WASD Scripting). |
query-string: | Query string specified in request line. Same information as provided in CGI variable QUERY_STRING. |
rand: | Value from a random number generator. See 5.3.2 Rand: Keyword. |
redirected: | If a request has been internally redirected (10.5.2 REDIRECT Rule) this conditional will be non-zero. Can be used as a boolean or with a digit specified. |
referer: | URL of referring page as provided in "Referer:" request header field. CGI variable HTTP_REFERER. |
regex: | Simple boolean value. If configuration directive [RegEx] is enabled (and hence regular expression string matching, 4. String Matching) this will be true. |
remote-addr: | Client IP address. Same as provided as CGI variable REMOTE_ADDR. As with all IP addresses used for conditional testing this may be a wildcard string match or a network mask expressed as address/mask-length (see 5.3.7 Host Addresses). A domain (host) name preceded by a question mark may be specified (e.g. "?the.host.name"). The corresponding IP address is then looked up and compared to that of the client. This allows ad hoc host name based rules and is distinct from use of remote-host. Note that DNS lookup can introduce some latency to rule (and request) processing. |
remote-host: | Client host name if name resolution enabled, otherwise the IP address (same as remote-addr). CGI variable REMOTE_HOST. |
request: | Detect the presence of specific or unknown request fields. See 5.3.3 Request: Keyword. |
request-method: | HTTP method ("GET", "POST", etc.) specified in the request line. CGI variable REQUEST_METHOD. |
request-protocol: | Detect the HTTP protocol in use for the request, as "2", "1.1", "1.0" or "0.9". Note that the server-protocol conditional will indicate 1.1 when the request-protocol indicates 2. The server and its applications (scripts) still treat it semantically as HTTP/1.1. |
request-scheme: | Request protocol as "http:" or "https:". CGI variable REQUEST_SCHEME. |
request-uri: | The unescaped request path plus any query-string. CGI variable REQUEST_URI. |
restart: | A numeric value, zero to maximum, representing the number of times path mapping has been SET map=restart. Can be used as a boolean or with a digit specified. |
robin: | Used to check whether a particular, clustered instance of WASD is available and distribute requests to it using a round-robin algorithm. See 5.3.4 Instance: and Robin: Keywords. |
script-name: | After the first pass of rule mapping (script component resolution), or during authorization processing, any script component of the request URI. |
server-addr: | The service IP address. CGI variable SERVER_ADDR. This may be wildcard string match or network mask expressed as address/mask-length. |
server_connect_gt: | An integer representing the current server network connections (those currently being processed plus those currently being "kept alive"). If greater than this value returns true, otherwise false. |
server_process_gt: | An integer representing the current server requests in-progress. If greater than this value returns true, otherwise false. |
server-name: | The (possibly virtual) server name. This may or may not exactly match any string provided via the host keyword. CGI variable SERVER_NAME. |
server-port: | The (possibly virtual) server port number. CGI variable SERVER_PORT. |
server-protocol: | "1.1", "1.0", "0.9" representing the HTTP protocol used by the request. |
server-software: | The server identification string, including the version. For example "HTTPd-WASD/8.0.0 OpenVMS/AXP SSL". CGI variable SERVER_SOFTWARE. |
service: | This is the composite server name plus port as server-name:port. To match an unknown service use "?". |
ssl: | Simple boolean value. If request is via Secure Sockets Layer then this will be true. |
syi_arch_name: | System information; CPU architecture of the server system, "Alpha", "Itanium" or "VAX". |
syi_hw_name: | System information; hardware identification string, for example "AlphaStation 400 4/233". |
syi_nodename: | System information; the node name, for example "KLAATU". |
syi_version: | System information; VMS version string, for example "V7.3". |
tcpip: | A string derived from the UCX$IPC_SHR shareable image. It looks something like this "Compaq TCPIP$IPC_SHR V5.1-15 (11-JAN-2001 02:28:33.95)" and comprises the agent (Compaq, MultiNet, TCPware, unknown), the name of the image, the version and finally the link date. |
time: | Compare to current system time. See 5.3.5 Time: Keyword. |
trnlnm: | Translate a logical name. See 5.3.6 Trnlnm: Keyword. |
upstream-addr: | Client proxy/accelerator IP address, when "SET CLIENT=keyword" has been applied to enable transparent up-stream proxy. Same as provided as CGI variable UPSTREAM_ADDR. As with all IP addresses used for conditional testing this may be wildcard string match or network mask expressed as address/mask-length (see 5.3.7 Host Addresses). |
user-agent: | Browser identification string as provided in "User-Agent:" request header field. CGI variable HTTP_USER_AGENT. |
webdav: | Simple boolean value. If the request has been identified as WebDAV then this is true. Takes an optional parameter, "MSagent", which is true if a Microsoft WebDAV agent has been detected. |
websocket: | Simple boolean value. If the request is a WebSocket protocol upgrade request this will be true. |
x-forwarded-for: | Proxied client name or address as provided in "X-Forwarded-For:" request header field. CGI variable HTTP_X_FORWARDED_FOR. |
The request notepad is a string storage area that can be used to store and retrieve ad hoc information during path mapping and subsequent authorization processing. The notepad contents can be changed using the SET notepad=<string> or appended to using SET notepad=+<string> (10.5.5 SET Rule). These contents then can be subsequently detected using the notepad: conditional keyword (or the obsolescent 'NO' mapping conditional) and used to control subsequent mapping or authorization processing.
Notepad information persists across internal redirection processing (10.5.2 REDIRECT Rule) and so may be used when the regenerated request is mapped and authorized. To prevent such information from unexpectedly interfering with internally redirected requests a notepad="" can be used to empty the storage area.
The dictionary facility provides similar and arguably superior functionality. See 5.5.4 WATCH Dictionary. In fact notepad is now implemented as a dictionary entry.
At the commencement of each pass a new pseudo-random number is generated (and therefore remains constant during that pass). The rand: conditional is intended to allow some sort of distribution to be built into a set of rules, where each pass (request) generates a different one. The random conditional accepts two parameters: a modulus number, by which the base number is reduced using the modulo operation, and a comparison number, which is compared to the result of that operation.
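The arithmetic behind the rand: test can be modelled in a few lines of Python. This is only a sketch of the described behaviour, not WASD's own code. The function names `rand_matches` and `choose_bucket` are hypothetical, and the base number is passed in as a plain integer rather than taken from the server.

```python
def rand_matches(base: int, modulus: int, comparison: int) -> bool:
    """Return True when the base number modulo `modulus` equals `comparison`."""
    return base % modulus == comparison


def choose_bucket(base: int, modulus: int) -> int:
    """Return the single comparison value that tests true for this base number."""
    for comparison in range(modulus):
        if rand_matches(base, modulus, comparison):
            return comparison
    raise ValueError("modulus must be a positive integer")


# With a modulus of 3, successive base numbers spread over comparisons 0, 1 and 2.
buckets = [choose_bucket(base, 3) for base in range(6)]
```

Because the base number stays constant during a pass, exactly one comparison value in the range 0 to modulus-1 tests true for a given request. This is what lets a set of rules split requests between alternatives.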
Hence the following conditional rules
Looks through each of the lines of the request header for the specified request field and/or value. This may be used to detect the presence of specific or unknown (to the server) request fields. When detecting just a specified field the name can be provided
Note that all request fields known to the server have a specific associated conditional keyword (i.e. "user-agent:" for the above example). To determine whether any request fields unknown to the server have been supplied use the request: keyword as in the following example.
Both of these conditionals are designed to allow the redistribution of requests between clustered WASD services. They are WASD-aware and so allow a slightly more tailored distribution than perhaps an IP package round-robin implementation might. Each tests for the current operation of WASD on a particular node (using the DLM) before allowing the selection of that node as a target. This can allow some systems to be shutting down or starting up, or have WASD shutdown for any reason, without requiring any extraordinary procedures to allow for the change in processing environment.
The instance: directive allows testing for a particular cluster member having a WASD instance currently running. This can allow requests to be redirected or reverse-proxied to a particular system with the knowledge that it should be processed (of course there is a small window of uncertainty as events such as system shutdown and startup occur asynchronously). The behaviour of the conditional block is entirely determinate based on which node names have a WASD instance and the order of evaluation. Compare this to a similar construct using the robin: directive, as described below.
This conditional is deployed in two phases. In the first, it contains a comma-separated list of node names (that are expected to have instances of WASD instantiated). In the second, it contains a single node name, allowing the selected node to be tested. For example.
If none of the node names specified in the first phase is currently running a WASD instance the rule returns false, otherwise true. If true, the above example has the conditional block processed with each of the node names successively tested. If NODE1 has a WASD instance executing it returns true and the associated redirect is performed. The same applies for NODE2 and NODE3. At least one of these would be expected to test true, otherwise the outer conditional established during phase one would have been expected to return false.
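The determinate, order-dependent behaviour of the two instance: phases can be sketched in Python. This is a model under stated assumptions, not the server's implementation. The set `running` stands in for the DLM-based check of which nodes have a WASD instance, and the helper names are hypothetical.

```python
from typing import Iterable, Optional, Set


def any_instance(nodes: Iterable[str], running: Set[str]) -> bool:
    """Phase one: true if at least one listed node has a running instance."""
    return any(node.upper() in running for node in nodes)


def first_running(nodes: Iterable[str], running: Set[str]) -> Optional[str]:
    """Phase two: test each node in order and stop at the first one running."""
    for node in nodes:
        if node.upper() in running:
            return node
    return None


nodes = ["NODE1", "NODE2", "NODE3"]
running = {"NODE2", "NODE3"}          # assumed state: NODE1 has no instance
target = first_running(nodes, running) if any_instance(nodes, running) else None
```

With the same set of running nodes the same target is always chosen, which is the contrast with robin: selection.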
The robin: conditional allows rules to be applied sequentially against specified members of a cluster that currently have instances of WASD running. This is obviously intended to allow a form of load sharing and/or redundancy (not balancing, as no evaluation of the selected target's current workload is performed, see below). As with the instance: directive above, there is, of course, a small window of potential uncertainty as events such as system shutdown and startup occur asynchronously and may impact availability between the phase one test and ultimate request distribution.
This conditional is again used in two phases. The first contains a comma-separated list of node names (that are expected to have instances of WASD instantiated). The second contains a single node name, allowing the selected node (from phase one) to have a rule applied. For example.
In this case round-robining will be made through four node names. Of course these do not have to represent all the systems in the cluster currently available or having WASD instantiated. The first time the 'robin:' rule containing multiple names is called VAX1 will be selected. The second time ALPHA1, the third ALPHA2, and the fourth IA64A. With the fifth call VAX1 is returned to, the sixth ALPHA1, etc. In addition, the selected node name is verified to have an instance of WASD currently running (using the DLM and WASD's instance awareness). If it does not, round-robining is applied again until one is found (if none is available the phase one conditional returns false). This is most significant as it ensures that the selected node should be able to respond to a redirected or (reverse-)proxied request. This is the selection set-up phase.
Then there is the selection application phase. Inside the set-up conditional other conditionals apply the selection made in the first phase (through simple nodename string comparison). The rule, in the above example a redirect, is applied if that was the node selected.
During selection set-up unequal weighting can be applied to the round-robin algorithm by including particular node names more than once.
In the above example, the node ALPHA will be selected twice as often as either of VAX1 and VAX2 (and because of the ordering interleaved with the VAX selections).
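The selection set-up phase, including the skipping of unavailable nodes and weighting by repetition, can be sketched as follows. This is an illustrative Python model, not WASD's implementation. The class name `RobinSelector` and the `is_running` callback (standing in for the DLM check) are hypothetical, and the weighted list `VAX1, ALPHA, VAX2, ALPHA` is an assumed ordering consistent with the interleaving described above.

```python
from typing import Callable, List, Optional


class RobinSelector:
    """Hypothetical model of robin: selection set-up (not WASD's own code)."""

    def __init__(self, nodes: List[str], is_running: Callable[[str], bool]) -> None:
        self.nodes = list(nodes)
        self.is_running = is_running
        self.next_index = 0  # position of the next candidate in the list

    def select(self) -> Optional[str]:
        """Return the next running node, or None if no listed node is running."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.nodes)
            if self.is_running(node):
                return node
        return None  # corresponds to the phase one conditional returning false


# A node listed twice is selected twice as often as nodes listed once.
weighted = RobinSelector(["VAX1", "ALPHA", "VAX2", "ALPHA"], lambda node: True)
picks = [weighted.select() for _ in range(8)]
```

The selector only rotates through names. It never looks at the workload of the chosen node, which is why the text describes this as load sharing rather than load balancing.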
The time: conditional allows server behaviour to change according to the time of day, week, or even year. It compares the supplied parameter to the current system time in one of three ways.
The trnlnm: conditional dynamically translates a logical name and uses the value. One mandatory and up to two optional parameters may be supplied.
The logical-name must be supplied; without it false is always returned. If just the logical-name is supplied the conditional returns true if the name exists or false if it does not. The default name-table is LNM$FILE_DEV. When the optional name-table is supplied the lookup is confined to that table. If the optional string-to-match is supplied it is matched against the value of the logical and the result returned.
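The decision logic just described can be modelled with a small Python function. This is a sketch under several assumptions. Logical name tables are represented as plain dictionaries, and LNM$FILE_DEV is treated as a single table rather than the search list it is on a real system. Python's `fnmatchcase` stands in for WASD wildcard matching, and the logical name `EXAMPLE_MODE` is hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, Optional

DEFAULT_TABLE = "LNM$FILE_DEV"


def trnlnm(tables: Dict[str, Dict[str, str]],
           logical_name: str,
           name_table: Optional[str] = None,
           string_to_match: Optional[str] = None) -> bool:
    """Model of the trnlnm: test; table and logical names are keyed in upper case."""
    if not logical_name:
        return False                      # no logical name: always false
    table = tables.get((name_table or DEFAULT_TABLE).upper(), {})
    value = table.get(logical_name.upper())
    if value is None:
        return False                      # the name does not exist in that table
    if string_to_match is None:
        return True                       # existence test only
    # Case-insensitive wildcard comparison against the translated value.
    return fnmatchcase(value.lower(), string_to_match.lower())


tables = {
    "LNM$FILE_DEV": {"EXAMPLE_MODE": "maintenance"},
    "LNM$SYSTEM": {},
}
in_maintenance = trnlnm(tables, "example_mode", string_to_match="maint*")
```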
Host names or addresses can be an alpha-numeric string (if DNS lookup is enabled), or a dotted-decimal network address, a slash, then a dotted-decimal mask. For example "131.185.250.0/255.255.255.192". This mask covers the leading 26 bits of the address, leaving a 6 bit host range. It operates by bitwise-ANDing the client host address with the mask, bitwise-ANDing the network address supplied with the mask, then comparing the two results for equality. Using the above example the host 131.185.250.50 would be accepted (50 AND 192 is 0, matching the network address), but 131.185.250.250 would be rejected (250 AND 192 is 192). Equivalent notation for this rule would be "131.185.250.0/26".
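The mask comparison can be checked with a short Python sketch. This illustrates the described algorithm for IPv4 addresses only and is not the server's code. The function name `mask_matches` is hypothetical.

```python
import ipaddress


def mask_matches(client: str, rule: str) -> bool:
    """Compare client and network addresses after ANDing both with the mask.

    `rule` is "network/mask", where the mask is either dotted-decimal
    (255.255.255.192) or a mask length in bits (26).
    """
    network, _, mask = rule.partition("/")
    if mask.isdigit():
        bits = int(mask)
        mask_int = (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF
    else:
        mask_int = int(ipaddress.IPv4Address(mask))
    client_int = int(ipaddress.IPv4Address(client))
    network_int = int(ipaddress.IPv4Address(network))
    return (client_int & mask_int) == (network_int & mask_int)


inside = mask_matches("131.185.250.50", "131.185.250.0/255.255.255.192")
outside = mask_matches("131.185.250.250", "131.185.250.0/255.255.255.192")
```

Here `inside` is true and `outside` is false, because 50 AND 192 is 0 while 250 AND 192 is 192. The mask-length form `131.185.250.0/26` gives the same results.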
The following provides a collection of examples of conditional mapping and authorization rules illustrating the use of wildcard matching, network mask matching and the various formats in which the rules may be blocked.
Of course there are a multitude of possibilities based on this idea!
The per-request dictionary stores key-value string pairs related to request processing. Some entries are generated and used internally by the server and others may be inserted, value changed, removed and tested by the server admin for conditional processing purposes.
The dictionary was initially introduced as an abstraction layer between the significantly different HTTP/2 and HTTP/1.n header semantics and server internal processing. Its utility was then extended into configuration. It is implemented as a standard hash table with collision lists. The small cost in terms of processing is completely offset by its effectiveness.
Dictionary entries may be configured using the SET dict=key=value mapping rule or the DICT key=value meta keyword. These are known as configuration entries. Keys must begin with an alpha-numeric character but otherwise keys and values may contain any printable character, with some needing to be escaped in the text of configuration files. These are some examples of each.
If an existing key is (re-)inserted it overwrites the old value.
An entry can have an empty value.
An entry may be removed from the dictionary by prefixing the key name with an exclamation point.
All configuration entries may be removed by using the exclamation point with an empty key.
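The insert, overwrite, empty-value and removal behaviours can be summarised in a small Python model. It is a sketch only. The function name `apply_dict_setting` is hypothetical, the argument is the text that would follow `dict=` (or DICT), and only configuration entries are modelled. Escaping of special characters is ignored.

```python
from typing import Dict


def apply_dict_setting(entries: Dict[str, str], setting: str) -> None:
    """Apply one 'key=value', 'key=', '!key' or '!' setting to the entries."""
    if setting == "!":
        entries.clear()                   # empty key: remove every configuration entry
    elif setting.startswith("!"):
        entries.pop(setting[1:], None)    # remove one entry if it is present
    else:
        key, _, value = setting.partition("=")
        entries[key] = value              # insert, or overwrite an existing key


entries: Dict[str, str] = {}
for setting in ("example1=one", "example2=two", "example1=uno", "example3=", "!example2"):
    apply_dict_setting(entries, setting)
```

After these settings `entries` holds `example1` with the overwritten value `uno` and `example3` with an empty value. A final `!` setting would leave it empty.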
As mentioned, the server generates and uses dictionary entries during request processing. There are multiple types of entry, generally insulated from each other for good reason. These entries are also available for conditional testing.
Character | Type | Description |
---|---|---|
~ | configuration | admin managed entry |
$ | internal | server processing |
> | request | request header field |
< | response | response header field |
The "if (dict:expression)" construct first checks for a configuration entry, then for a request header field entry, then finally for an internal entry (response entries are only available for testing after response processing begins and so are not in the search list). It is also possible to test for a key of a specific type by prefixing the key name with the type character. This example shows a request header field being conditionally processed.
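The search order and the effect of a type prefix can also be modelled in Python. This is a hedged sketch and not the server's hash table implementation. Entries are stored under a (type character, key) pair, and the key names used are hypothetical.

```python
from typing import Dict, Optional, Tuple

CONFIG, INTERNAL, REQUEST, RESPONSE = "~", "$", ">", "<"
SEARCH_ORDER = (CONFIG, REQUEST, INTERNAL)   # response entries are not searched


def lookup(entries: Dict[Tuple[str, str], str], key: str) -> Optional[str]:
    """Return the value a dict: test would see, or None if there is no entry."""
    if key[:1] in (CONFIG, INTERNAL, REQUEST, RESPONSE):
        # An explicit type character confines the lookup to that entry type.
        return entries.get((key[0], key[1:]))
    for entry_type in SEARCH_ORDER:
        value = entries.get((entry_type, key))
        if value is not None:
            return value
    return None


entries = {
    (CONFIG, "user-agent"): "configured value",
    (REQUEST, "user-agent"): "ExampleBrowser/1.0",
    (INTERNAL, "example_internal"): "server value",
}
unprefixed = lookup(entries, "user-agent")
prefixed = lookup(entries, ">user-agent")
```

An unprefixed key finds the configuration entry first, so `unprefixed` is the configured value, while the `>` prefix reaches the request header field directly. Since keys must begin with an alpha-numeric character, a leading type character is never ambiguous.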
It is also possible to set an entry of a specific type by prefixing the key with the type character. For example the following will set a response header field that will be included in the header when returned to the client.
Setting any non-configuration entry should only be undertaken by the literati or the brave.
The value of a dictionary entry can be derived in whole or part from the value of another entry or entries. This uses a somewhat familiar substitution syntax. A contrived example shows an entry being set that transfers back the request user-agent header field as a response header field.
The content of a request's dictionary at significant stages of request processing can be viewed using the [x]Internal item of a WATCH report. See WATCH Facility of WASD Features and Facilities.
A request dictionary WATCH point is similar to the following (end of request processing) example. Note that all of the entry types described above are present in the example, including two configured entries. Note also that two of the internal entries contain embedded line-breaks and empty lines. This is an HTTP/2 request and the expanded (HTTP/1.n style) request_header and response_header entries are due to WATCH items Request [x]Header and Response [x]Header also being checked. They were not required for request processing.
The first three-digit number is simply the entry count in order of insertion. The second, either square bracketed or period delimited, is the hash table entry. The square brackets indicate the head of the hash table, the periods an entry further down the collision list. The single punctuation character is used to indicate and differentiate the entry type. Then come the key and equate-separated value. The brace-enclosed numbers are the lengths of the key and value respectively.