HTTP Server

The PowerNet HTTP server is a multi-threaded server that can accept multiple connections, limited only by available heap space. For details of the server architecture, see Servers.fth.

Web pages may be served from memory or from the FAT file system. If you are using the FAT file system, you can configure the root directory for web pages and the name of the home page. If you do not specify them, they will default to \PAGES and /home.htm.


create pagedir$  ", \HTTP"      \ -- caddr
\ *G The base directory for pages as a counted string.
\ ** The directory name must not end in a separator.
create HomePage$  ", /home.asp" \ -- caddr
\ *G Holds the counted string for the home page.

HTTP specific data

These data definitions are required by each HTTP task. The data is allocated at the start of the task and released when the task is TERMINATEd. The chain SVCHAIN links all the service tasks.

For details of QVARs see the CGI section below.

#80 equ HTTPPort#       \ -- n ; standard is 80
Define the port used for the HTTP server. The standard port is 80. Moved to PNconfig.fth.

#8 equ #QVARS   \ -- n
Number of QVARs in each connection and the common area.

#20 equ /QvarName       \ -- n
Length including count byte of a QVAR's name.

#20 equ /QvarData       \ -- n
Length including count byte of a QVAR's data area.

struct /QVarRec         \ -- n
Structure of a single QVAR.

#QVARS /QVarRec * equ /QVARS    \ -- n
Size of a QVAR buffer.

struct /HTTPdata        \ -- n
The standard service data structure is extended for HTTP.

0 equ NO_SCRIPT         \ -- 0
No selected script language identifier.

-1 equ FORTH_SCRIPT     \ -- -1
Selected script language identifier for Forth.

$80000000 equ WEBS_CLOSE        \ -- mask
Status bit to close the connection.

$40000000 equ WEBS_SCRIPTERR    \ -- mask
Status bit for a script error.

$00000100 equ WEBS_ASP          \ -- mask
Status bit for ASP processing.

$00000002 equ WEBS_KEEP_ALIVE
Status bit for "keep alive" handling.

0 equ PLAIN_TYPE        \ -- 0
The data type indicator for PLAIN data.

1 equ GIF_TYPE          \ -- 1
The data type indicator for GIF data.

2 equ HTML_TYPE         \ -- 2
The data type indicator for HTML data.

3 equ ASP_TYPE          \ -- 3
The data type indicator for ASP data - HTML data with server side scripting. Output is not buffered.

4 equ JPEG_TYPE         \ -- 4
The data type indicator for JPEG data.

5 equ XML_TYPE          \ -- 5
The data type indicator for XML data.

6 equ ASMX_TYPE         \ -- 6
The data type indicator for ASMX data in web services - XML data with server side scripting.

7 equ ASPX_TYPE         \ -- 7
The data type indicator for ASP data - XML data with server side scripting. Output is buffered.

8 equ CSS_TYPE          \ -- 8
The data type indicator for CSS data.

: WebErrCode    \ -- addr
Return the address of the service's error code.

: WebStatus     \ -- addr
Return the address of the service's status flags.

: WebVars       \ -- addr
Return the address of the service's connection variables.

: WebScriptId   \ -- addr
Return the address of the service's script identifier.

: WebOpFlags    \ -- addr
Return the address of the service's operation flags.

  bit0  set if primary headers have been read.

: WebPageDef    \ -- struct
Return the address of the file page definition structure. This word is forward referenced.

: TestWebFlag   \ mask -- n|0
Test the mask bits in the service's status flag cell.

: SetWebFlag    \ mask --
Set the mask bits in the service's status flag cell.

: ClrWebFlag    \ mask --
Clear the mask bits in the service's status flag cell.
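
As an illustration, the flag words can be combined to request that a connection is closed after the current response. This is only a sketch: CloseAfterReply is a hypothetical name, and it assumes that the service task acts on WEBS_CLOSE once the response has been sent.

  : CloseAfterReply       \ -- ; hypothetical helper
    WEBS_KEEP_ALIVE ClrWebFlag          \ drop any keep-alive request
    WEBS_CLOSE SetWebFlag               \ ask for the connection to be closed
  ;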

: SetHttpTimeout        \ time --
Set the timeout target for HTTP. A value of 0 indicates no timing.

: GetHttpTimeout        \ -- time
Get the timeout target for HTTP. A value of 0 indicates no timing.

HTTP vectored I/O

Stream socket

The HTTP server establishes its own generic I/O based on that in SERVERS.FTH in order to handle special character processing in the future.

create ConsoleHTTP      \ -- addr ; OUT managed by upper driver
Function despatch table for HTTP I/O. OUT is managed by the upper level driver.

: HTTPio        \ --
Select HTTP as the console.

: Init-ConsoleHTTP      \ --
Initialise for console I/O by HTTP. Note that the HTTP must have been set up and the private service area initialised.

Output to a memory buffer

For correct handling of error responses, some browsers require the HTTP "Content-Length" field to be defined. This means that the output length must be known before sending. Consequently, error messages are buffered before transmission.

N.B. All output to the buffer is unchecked for overflow. Checking is the responsibility of the application.

cell +user my_opbuff    \ -- addr
Holds the current output buffer, the first cell holding the buffer length.

: BuffEmit      \ char --
Send a character to the buffer.

: BuffType      \ caddr len --
Send a string to the buffer.

: BuffCr        \ --
Send a CR/LF pair to the buffer.

create ConsoleBuff      \ -- addr ; OUT managed by upper driver
Function despatch table for HTTP buffered I/O. OUT is managed by the upper level driver.

: Buff$         \ -- addr len
Return the address and length of the HTTP output buffer.

: Init-ConsoleBuff      \ len -- addr|0
Initialise for console I/O by the HTTP output buffer. Len is the required size of the buffer and the address of the buffer is returned for success, or zero is returned if the buffer could not be allocated. The first cell of the buffer contains the length used, the rest is for data.

: Term-ConsoleBuff      \ --
Terminate buffer I/O by freeing the buffer.
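
The buffer words might be combined as in the sketch below, which builds a short message in a buffer so that its length is known before it is sent. SendBuffered is a hypothetical name. The sketch assumes that WebSend may be called while the buffer exists, and that reselecting HTTPio afterwards is harmless. Because buffer output is unchecked, the caller must ensure that the string plus the CR/LF pair fits in the space requested.

  : SendBuffered  \ caddr len -- ; hypothetical
    #256 Init-ConsoleBuff 0= if         \ no heap available: give up quietly
      2drop  exit
    endif
    BuffType  BuffCr                    \ unchecked: text must fit the buffer
    Buff$ WebSend                       \ contents and length are now known
    Term-ConsoleBuff                    \ free the buffer
    HTTPio                              \ assumed: reselect the socket console
  ;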

Diagnostic control

1 value httpDiags?      \ -- n
Set this non-zero to get diagnostic information.

: [hd   httpDiags? if consoleio decimal  ;
A COMPILER macro used to surround debug code, and terminated by HD].


  [HD  ." debug message"  HD]

: hd]   HttpIo  endif  ;
Terminates a [HD ... HD] structure.

: .hdLine       \ caddr len --
Display text with leading CR on Forth console. If httpDiags? is set to zero, no action is taken.

Transmit Utilities

PDATA_MAX equ WEB_SIZE  \ -- n
Maximum web content sent in one packet.

: WebSend       \ caddr len --
Send an arbitrary sized data block to the HTTP client.

CGI Support

Common Gateway Interface (CGI) defines how data is passed between a browser (client) and a server. The equivalent of a variable is a name/value pair (e.g. submit=send). Such pairs are sent by the browser after filling in a form.

CGI variables are "URLencoded" as key/value pairs of the form:

  key1=value1&key2=value2&...

When a form response is sent with a GET message, the CGI variables are sent after the URI separated from it by a '?' character. When a form response is sent with a POST message, the CGI variables appear in the body of the message.

When a GET message is used, PowerNet receives the CGI variables before it knows what to do with them. They are saved in structures called QVARs, which hold counted strings. Numeric values are converted from text by the appropriate Forth words. Name comparison is case-insensitive.

PowerNet manages two sets of these variables. The first is a common set, which may be used to hold system data such as the host's IP address. The second set is allocated per connection as part of the service data, and only exists for the duration of the connection.

#20 equ /QvarName       \ -- n
Length including count byte of a QVAR's name.

#20 equ /QvarData       \ -- n
Length including count byte of a QVAR's data area.

struct /QVarRec         \ -- n
Structure of a single QVAR.

#32 equ #QVARS  \ -- n
Number of QVARs in each connection and the common area.

#QVARS /QVarRec * equ /QVARS    \ -- n
Size of a QVAR buffer.

/QVARS buffer: commonQvars      \ -- addr
The buffer area for the common QVARs.

$80000000 equ NO_VAR_SET        \ -- n
Indicator returned when a QVAR has not been set.

/QvarName 1 chars - equ /QV.name        \ -- n
Maximum length of a QVAR's name.

/QvarData 1 chars - equ /QV.data        \ -- n
Maximum length of a QVAR's string data.

: (.Qvars)      \ addr --
Display the variables in the given table.

: .Qvars                \ --
Display the common and connection variables.

: (freeqvar)    \ table -- addr|0
Find a free QVAR in the given table, and return its address or zero if no free space is available.

: freeQvar      \ -- addr|0
Find a free QVAR for the current connection, and return its address or zero if no free space is available.

: freeCommonQvar        \ -- addr|0
Find a free QVAR in the common QVARs, and return its address or zero if no free space is available.

: (findQvar)    \ caddr len table -- addr|0
Try to find a QVAR in the given table. Case insensitive.

: findQvar      \ caddr len -- addr|0
Try to find the given QVAR name, returning the address if found or zero if not found. Case insensitive.

: %xx>char      \ caddr len -- caddr' len' char
Convert the three character sequence "%xy" to a character, treating xy as a two-digit hexadecimal number. Step over the sequence in the string.

: decodeURL$    \ caddr len dest dlen --
Converts the source string caddr/len from a URL-encoded string to a decoded counted string in the buffer dest/dlen. No error checking is performed. Decoding converts '+' characters to a space and "%ab" ('%' followed by two hex digits) sequences to their character codes.
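
A minimal sketch of decoding follows. Decoded$ and .Decoded are hypothetical names, and the buffer size of 64 bytes is an arbitrary choice. Since no error checking is performed, the source string must be well formed and must fit the destination.

  #64 buffer: Decoded$    \ -- addr ; hypothetical destination buffer

  : .Decoded      \ caddr len -- ; hypothetical
    Decoded$ #64 decodeURL$             \ result is a counted string
    Decoded$ count type
  ;

  s" fred%40example.com+rocks" .Decoded

Given the decoding rules above, "%40" becomes '@' and '+' becomes a space, so the text displayed should be "fred@example.com rocks".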

: setQvarData   \ caddr len qvar --
Use the given URL encoded string caddr/len to set the data area of the given qvar.

: setQstring    \ name nlen string slen --
Set the connection QVAR name/nlen to contain string/slen. If the name already exists in the connection or common QVARs it is overwritten. If the QVAR does not exist it is created in the connection's QVAR set. If there is no space for it, the request is ignored.

: setCommonQstring      \ name nlen string slen --
Set the common QVAR name/nlen to contain string/slen. If the QVAR does not exist it is created. If there is no space for it, the request is ignored.

: GetQstring    \ name nlen -- caddr len
Return the text held in the named QVAR. If the variable cannot be found, "???" is returned.

: .Qstring      \ name nlen --
Output the text for a QVAR using GetQstring above.
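
For example, a common QVAR could hold a site-wide string that any page can display. The names InitSiteVars, .SiteName and sitename are hypothetical, and the data is kept well below the limit set by /QV.data.

  : InitSiteVars  \ -- ; hypothetical, run once at start up
    s" sitename" s" Demo-Server" setCommonQstring
  ;

  : .SiteName     \ -- ; hypothetical, e.g. called from a script
    s" sitename" .Qstring
  ;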

: websetvars    \ caddr len --
Process a query string. A query string has the form:

name=value&name=value...

WEBSETVARS can be used with any query string.
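
The following sketch shows a query string being turned into QVARs and read back. DemoQuery is a hypothetical name, and the word is assumed to run inside an HTTP service task so that the connection QVARs exist.

  : DemoQuery     \ -- ; hypothetical
    s" sname=Robert&send=submit" websetvars
    s" sname" .Qstring                  \ should output Robert
    s" missing" .Qstring                \ not set, so should output ???
  ;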

: WebQueryVars  \ caddr len --
This is used by GET with query packets.

: init-CommonQvars      \ --
Initialise the common QVARs.

: init-WebVars  \ --
Initialise the connection QVARs.

Numeric QVARS

: setQvar       \ caddr len n --
Set the given QVAR to n, which is held as a signed decimal string.

: getQvar       \ caddr len -- n
Return the value held in the QVAR as a signed decimal number. If the QVAR does not exist NO_VAR_SET is returned. If the string cannot be converted, zero is returned.
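
A sketch of a read, modify and write sequence on a numeric QVAR is given below. DoubleQvar is a hypothetical name. The check against NO_VAR_SET avoids creating a variable that was never set.

  : DoubleQvar    \ caddr len -- ; hypothetical
    2dup getQvar                        \ -- caddr len n
    dup NO_VAR_SET = if                 \ never set: leave it alone
      drop 2drop  exit
    endif
    2*  setQvar                         \ store the doubled value
  ;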

CEM specific numeric items are commented out.

: QvarString?   \ caddr len -- type
Return true if the variable is a string variable.

ASP Support

ASP stands for "Active Server Pages". In PowerNet, these are HTML pages which are modified by PowerNet when served. See Examples\PowerNet\TestPages\thanks.asp for an example.

The script language is Forth itself. Inside an HTML document, scripting commands (Forth source) are contained inside tags of the form:

 <% Forth_code %>

It is important that the script delimiters <% and %> are surrounded by white space, otherwise the very simple parser will fail. Before any scripting can be performed, the first script command must select the language and must be on one line:

  <% language=forthscript %>

After that, Forth source code can be interpreted. Note that any CGI variables (QVARS above) are available. For example, if a form was submitted with a GET request:

  GET form1.asp?sname=Robert&send=submit

You can display the data using a script such as:

  <% s" sname" .qstring %>

Pages can be served from a linear memory image, or from the FAT file system. When scripts are served from memory, each script section must be on a single line. When serving scripts from files, the script section can extend over several lines.
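
Putting these rules together, a complete page might look like the sketch below. The page content is hypothetical. Each script section is kept on a single line so that the page could be served from memory as well as from a file, and every delimiter has white space on both sides.

  <html><body>
  <% language=forthscript %>
  <p>Hello <% s" sname" .qstring %> </p>
  </body></html>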

: script_code   \ -- caddr len
Returns the command to select Forth as the scripting language.

: asp_header$   \ -- caddr len
ASP command header string.

: asp_tail$     \ -- caddr len
ASP command tail string.

: ScriptEngine  \ caddr len --
Processes the string as Forth source. The data stack is checked on return to ensure system integrity.

: AspProcess    \ caddr len --
If Forth has been selected as the script language, pass the string to SCRIPTENGINE, otherwise try to find the Forth script command. In practice, this means that the language selection command must be on its own as the first script section.

: AspRequest    \ caddr len --
Extract script section and try to process it. All output is done using TYPE and WebSend.

Header scanning

HTTP headers are processed by building a list of actions and strings. The action is an xt and the string is the header text that we are interested in. Each action has the stack effect

  caddr len --

where caddr/len is the string after the header and the trailing colon.


create UploadHdrs       \ -- addr
  ' doCLength ,  ," Content-Length" align
  ' doCType1 ,   ," Content-Type" align
  0 , 0 ,
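
An application list can be built in the same way. In the sketch below, doUserAgent and MyHdrs are hypothetical names. The action words must be defined before the table that refers to them, and the list is assumed to be passed to ProcessHeaders from within an HTTP service task.

  : doUserAgent   \ caddr len -- ; hypothetical action
    .hdLine                             \ show the header value on the console
  ;

  create MyHdrs   \ -- addr ; hypothetical list
    ' doCLength ,   ," Content-Length" align
    ' doUserAgent , ," User-Agent" align
    0 , 0 ,                             \ end of list marker

  \ In the service task: MyHdrs ProcessHeaders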

: CheckName     \ src slen name nlen -- flag
Return true if the start of the string src/slen contains the entire name name/nlen.

: CheckHeader   \ src slen name nlen -- flag
Return true if the start of the string src/slen contains the entire name name/nlen and a trailing colon.

: doHeader      \ caddr len list --
Given a string, check it against the given list of headers.

: ProcessHeaders        \ list --
We have already received the GET/POST line. Process each header line and finish at the first blank line, ready for the message body. This can be used for the header blocks in forms as well as for the first header.

: doCLength     \ caddr len --
Process the string to extract the content length data, a decimal number. A valid result is saved in the HTTP service data's httpClength field.

variable FormType       \ -- addr
Holds the status extracted from the first Content-Type header. This is set true if the header includes "multipart/form-data".

#80 buffer: Boundary$   \ -- addr
Holds the boundary string.

: nextChar      \ caddr len -- caddr' len' char
Get the next character from the string and step on.

: ?MultiPart    \ caddr len --
Check for the multipart/form-data field.

: ExtractValue  \ caddr len dest --
Given a string starting after the name portion of a name/value pair, e.g. name="text", save the value text without any quote marks at dest. If there is no value text, the destination is left unchanged.

: ?Boundary     \ caddr len --
Check for the boundary=xxx field.

: doCType1      \ caddr len --
Process the string to extract the content type data, which is the multipart form item and the boundary string.

create BaseHdrs \ -- addr
Contains a list of the basic header fields to process.

BaseHdrs value DefHdrs  \ -- addr
Holds the default header list for file and memory pages.

: ReadHeaders   \ --
Process the headers after the HTTP command line up to the first blank line. Do not use this word for reading part headers. If the request headers have already been read, this word is a no-op.

Form body processing

Words are provided for use with form results submitted as a POST request to an ASP page. You can use these with simple URL encoded or multipart forms. An example of multipart form handling is the file upload application in Examples\WebPost.fth. This example includes boundary handling and parsing code.

Tools

Correct operation of this code requires a Content-Length header in the request. In many cases, the body of a POST request is a very long line. Because this server is designed for systems with limited RAM, key/value pairs are not read as a single line, but are read and parsed character by character. Keys are limited to 31 characters, and values to /SVtib - 32 characters. The service's TIB is where they are buffered.

: BodyLeft      \ -- addr
Returns the address holding the remaining size of the HTTP body. If no Content-Length field has been found, ReadHeaders will have set this to -1.

: -BodyLeft     \ n --
Reduce the remaining content by n.

: CGIname$      \ -- caddr
Buffer for name assembly.

: CGIval$       \ -- caddr
Buffer for data assembly.

: CGIend?       \ -- flag
Return true if the body content is exhausted.

: CGIkey        \ -- char
Read the next character. If the content is exhausted or there has been a socket error, LF is returned.

: ReadKey       \ -- ior
Read the key portion of a key/value pair, returning zero for success.

: ReadValue     \ -- ior
Read the value portion of a key/value pair, returning zero for success.

Application words

These words will mostly be used inside ASP scripts. Note that the key and value strings are not URL decoded. You can use decodeURL$ to perform the decode and copy in one operation.

It is assumed that you control both the RAM usage of the PowerNet server and that you control the scripts it serves. We are not trying to reimplement the Apache server!

To read the pairs, call ReadNextPair and inspect the return result. If it is good, examine and process the PairName and PairValue strings. Because you control the scripts, it is sensible to use key names that do not require URL decoding. Especially for user input, some key values may need decoding, e.g. email addresses. This decision is left up to you.

: ReadNextPair  \ -- ior
Read the next key/value pair and return 0 on success. -1 is returned for an unexpected end of input or socket error, and -2 is returned if input was truncated before the key terminator.

: PairName      \ -- caddr len
Return the last key/value pair's key text.

: PairValue     \ -- caddr len
Return the last key/value pair's value text.

: DumpPairs     \ --
A diagnostic tool for use in scripts. Read and dump all the key/value pairs without preserving them.
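
A typical read loop might be written as below. ShowPairs is a hypothetical name. The sketch assumes it is called from a script in a page that received a POST request with a Content-Length header, and it stops at the first error. The strings are output without URL decoding.

  : ShowPairs     \ -- ; hypothetical
    begin
      CGIend? 0=                        \ more body content to read?
     while
      ReadNextPair if  exit  endif      \ non-zero ior: stop on any error
      PairName type  ." ="  PairValue type  ." <br>"
    repeat
  ;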

HTTP headers and responses

This section deals with extracting the HTTP command data and with pages served from memory.

: web-file-name \ caddr len -- caddr len'
Return the string up to the next '?' character. Then, if the file name starts "HTTP://.../" step over "HTTP://...". This situation can occur with proxy servers.

: test-web-type \ caddr len -- type
Return the type code associated with the file extension, e.g. ".htm" or ".asp".

: .HTTP#        \ n --
Output the HTTP string and code line.

: .HTTPserver   \ --
Output the HTTP server name line.

: .HTTPdate     \ --
Output an HTTP date line.

: .HTTPar-none  \ --
Output the "accept-ranges: none" line.

: .HTTPcontent  \ type --
Output the text string for the given type. Unknown types generate "text/plain".

: .close/keep   \ --
Output the default connection type.

: WebResponse   \ datalen type code --
Output a suitable HTTP header for this type of data.
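
As a sketch, a fixed HTML string could be served as follows. SendHello is a hypothetical name. It is assumed that datalen is used for the Content-Length field, that code is the numeric HTTP status, and that WebResponse completes the header block so that the body can follow at once.

  : SendHello     \ -- ; hypothetical
    s" <html><body>Hello</body></html>"
    dup HTML_TYPE #200 WebResponse      \ header first: datalen type code
    WebSend                             \ then the body
  ;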

create Null$    \ -- addr
A null string which may be used as a counted string or a zero-terminated string.

: .ErrHead      \ len err# --
Output the error header. If len is non-zero, it is used for a "Content-Length" field.

: .ErrBody      \ caddr1 len1 caddr2 len2 err# --
Output a web error body using the given strings and error number. The string caddr2/len2 contains the error description, e.g. "Page not found". If len2 is zero no error text body is sent and caddr1/len1 is discarded. The string caddr1/len1 is displayed if len1 is non-zero, and is usually the resource name that caused the problem.

: weberror      \ caddr1 len1 caddr2 len2 err# --
Display a web error message using the given strings and error number. The string caddr2/len2 contains the error description, e.g. "Page not found". If len2 is zero no error text body is sent and caddr1/len1 is discarded. The string caddr1/len1 is displayed if len1 is non-zero, and is usually the resource name that caused the problem.
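
For example, an application could refuse a request in the style of Serve404. Serve403 is a hypothetical name. The resource name supplied by the caller becomes caddr1/len1.

  : Serve403      \ qaddr qlen -- ; hypothetical
    s" Forbidden" #403 weberror
  ;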

create HomePage$  ", /home.htm"
If not already defined, HomePage$ is set to contain the counted string /home.htm.

: CheckHomePage \ caddr len -- caddr' len'
Check for a home page string of the forms "/ " or "/" and if found replace with "/home.htm".

: ServeSMem     \ caddr len type --
Serve a page from memory, according to its page type extracted from the page name.

Serving files

#256 equ /FileUnit      \ -- len
The size of a part of a file processed at once.

: FileQuery     \ -- ; fetch line into TIB
Reset the input source specification to the console and accept a line of text into the input buffer.

: AspInterpret  \ --
Process the current input line as if it is text entered at the keyboard.

: InterpScript  \ --
Interpret a section of ForthScript which may extend over several lines.

: sendPrev      \ addr --
Send the source line before this address.

semaphore InterpSem     \ -- addr
Exclusive access semaphore for the Forth interpreter.

: FileScript    \ --
Read and process a file with server side scripting until EOF or an error. The file is already open and is not closed.

: AspFileReq    \ len --
Read and process a file of length len with server side scripting. The file is already open. All input is done using the FileCon Generic I/O device (see FATcore.fth). All output is done using TYPE and WebSend.

: badFS         \ x y z --
Discard three items and set error flags.

: (FileSend)    \ len buff --
Serve the file using the given buffer.

: FileSend      \ len --
Serve a plain file without scripting. The file is open and the details are in the given /PageDef structure.

: ServeFile     \ len type --
Serve a page from a file, according to its page type extracted from the page name. The request headers are read if they have not already been read.

HTTP service task

: ?CloseHTTP    \ --
Run after processing input to see if the connection should be closed.

: Serve404      \ qaddr qlen --
Serve a "not found" page.

: ServePage     \ qaddr qlen --
Serve a page defined by the string qaddr/qlen which may include CGI vars. The string is the command line with the command removed.

: http-cmd      \ caddr len -- caddr' len'
Deal with the command line after the command has been recognised. The input is the complete line. The output is the line without the command.

: http-get      \ caddr len --
Process a GET command. The input string is the complete input line containing the GET command.

semaphore PostSem       \ -- addr
Controls access to the single POST handler. POST requests are serialised because they may change the state of the system.

: http-post     \ caddr len --
Process a POST command. The input string is the complete input line containing the POST command.

: ParseHTTP     \ caddr len --
Process received HTTP command line. At present we only deal with GET and POST commands.

: doHTTPinput   \ --
Process any pending input.

: cleanHTTP     \ *sv -- *sv
Cleans up the HTTP system when a task is shut down from the kill chain.

: HTTPService   \ --
The HTTP service action or task launched for each established HTTP connection.

HTTP listening task

Listening tasks or actions are spawned when the HTTP server gets a connection.

: HTTPServer    \ -- ; stay here forever
The HTTP listening task.

0 value HTTPtask        \ -- 0|task
Returns 0 or the HTTP server task if running.

: RunHTTPtask   \ --
Start the HTTP server task.

: StopHTTPtask  \ --
Stop the HTTP server.
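
A start-up word can use HTTPtask to avoid launching the server twice. ?RunHTTP is a hypothetical name, and the sketch assumes that RunHTTPtask sets HTTPtask and that StopHTTPtask clears it.

  : ?RunHTTP      \ -- ; hypothetical
    HTTPtask 0= if  RunHTTPtask  endif  \ start only if not already running
  ;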

Notes on memory usage

The majority of page requests are made using a GET request. For these, only the first line of the header needs to be scanned and the request is contained in a single packet whose maximum size is defined in lower layers of the PowerNet system. Consequently a fixed size packet buffer is used for HTTP input.

When handling POST requests, e.g. for Web Services, the input data is not in the headers, but is contained in the body of the message. The size of this body is defined by the Content-Length header. When handling POST messages, the whole of the incoming message must be read, the body extracted and passed to the message handler. In addition, some handlers may need to process the message header. For example, Web Services need to use the SOAPaction header before SOAP version 1.2.

Scripting for output messages introduces the problem that the size of the message body is not (in general) known until the script output has been generated. This means that the Content-Length header (sent before the body) cannot be formed until the body has been generated.

To avoid the memory overhead of buffering script output, for ASP file requests no Content-Length header is generated and the connection is closed after the response has been sent to indicate that the message is complete. For web services, ASPX pages are served and are buffered, and a Content-Length header is generated.

Error messages such as "404 Not found" are always buffered to produce a valid Content-Length header because some browsers require this header for error messages.

The Content-Length header is not always required for HTTP version 1.1 and above. However, if HTTP 1.0 clients have to be supported the Content-Length header must be provided. In this case all scripted operations must be buffered. See RFC 2616 for more details.

Authentication

Before PowerNet v4.4, authentication was provided by a deferred word. Nobody reported using it, so it has been removed to save memory. This section shows how to put it back if required. All the code to be added can be found towards the end of HTTP.FTH.

More flexibility is provided in v4.4 because it is much easier to parse headers (see Examples\WebPost.fth), and scripting provides more choice as to which files are secured and which are public.

Add the line below to the /HTTPdata structure.

  int httpLogin                    \ login value

Add the following definitions.

: WebLogin      \ -- addr
Return the address of the service's login result.

defer WebAuthenticate   \ caddr len -- res|0
Given a GET string, returns a non-zero code for permission to carry on. If permission is refused, zero is returned.

: (WebAuthenticate)     \ caddr len -- res|0
The default action of WebAuthenticate always returns true.

Restore the following code to HTTP-CMD.

  2dup WebAuthenticate ?dup 0= if
    2drop  Null$ 0 s" Invalid login " #401 weberror  exit
  endif
  WebLogin !                            \ stash good login result
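
Once the deferred word has been restored, an application can install its own action. The sketch below refuses any request whose command line contains the text "/private" and permits everything else. PublicOnly is a hypothetical name, and the use of IS to set the deferred word is an assumption; some MPE compilers use ASSIGN ... TO-DO ... instead.

  : PublicOnly    \ caddr len -- res|0 ; hypothetical
    s" /private" search nip nip         \ -- flag ; true if the text was found
    0=                                  \ found: 0 to refuse, else -1 to allow
  ;

  ' PublicOnly is WebAuthenticate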

create BaseHdrs \ -- addr
Contains a list of the basic header fields to process.