The Powernet HTTP server is a multi-threaded server that can accept multiple connections limited only by available heap space. For details of the server architecture see Servers.fth.
Web pages may be served from memory or from the FAT file system. If you are using the FAT file system, you can configure the root directory for web pages and the name of the home page. If you do not specify them, they will default to \PAGES and /home.htm.
create pagedir$ ", \HTTP" \ -- caddr
The base directory for pages as a counted string.
The directory name must not end in a separator.
create HomePage$ ", /home.asp" \ -- caddr
Holds the counted string for the home page.
These data definitions are required by each HTTP task. The data
is allocated at the start of the task and released when the task is
TERMINATEd. The chain SVCHAIN links all the service tasks.
For details of QVARs see the CGI section below.
#80 equ HTTPPort# \ -- n ; standard is 80
Define the port used for the HTTP server. The standard
port is 80. Moved to PNconfig.fth.
#8 equ #QVARS \ -- n
Number of QVARs in each connection and the common area.
#20 equ /QvarName \ -- n
Length including count byte of a QVAR's name.
#20 equ /QvarData \ -- n
Length including count byte of a QVAR's data area.
struct /QVarRec \ -- n
Structure of a single QVAR.
#QVARS /QVarRec * equ /QVARS \ -- n
Size of a QVAR buffer.
struct /HTTPdata \ -- n
The standard service data structure is extended for HTTP.
0 equ NO_SCRIPT \ -- 0
No selected script language identifier.
-1 equ FORTH_SCRIPT \ -- -1
Selected script language identifier for Forth.
$80000000 equ WEBS_CLOSE \ -- mask
Status bit to close the connection.
$40000000 equ WEBS_SCRIPTERR \ -- mask
Status bit for a script error.
$00000100 equ WEBS_ASP \ -- mask
Status bit for ASP processing.
$00000002 equ WEBS_KEEP_ALIVE
Status bit for "keep alive" handling.
0 equ PLAIN_TYPE \ -- 0
The data type indicator for PLAIN data.
1 equ GIF_TYPE \ -- 1
The data type indicator for GIF data.
2 equ HTML_TYPE \ -- 2
The data type indicator for HTML data.
3 equ ASP_TYPE \ -- 3
The data type indicator for ASP data - HTML data with server
side scripting. Output is not buffered.
4 equ JPEG_TYPE \ -- 4
The data type indicator for JPEG data.
5 equ XML_TYPE \ -- 5
The data type indicator for XML data.
6 equ ASMX_TYPE \ -- 6
The data type indicator for ASMX data in web services - XML
data with server side scripting.
7 equ ASPX_TYPE \ -- 7
The data type indicator for ASP data - XML data with server
side scripting. Output is buffered.
8 equ CSS_TYPE \ -- 8
The data type indicator for CSS data.
: WebErrCode \ -- addr
Return the address of the service's error code.
: WebStatus \ -- addr
Return the address of the service's status flags.
: WebVars \ -- addr
Return the address of the service's connection variables.
: WebScriptId \ -- addr
Return the address of the service's script identifier.
: WebOpFlags \ -- addr
Return the address of the service's operation flags.
bit0: set if primary headers have been read.
: WebPageDef \ -- struct
Return the address of the file page definition structure.
This word is forward referenced.
: TestWebFlag \ mask -- n|0
Test the mask bits in the service's status flag cell.
: SetWebFlag \ mask --
Set the mask bits in the service's status flag cell.
: ClrWebFlag \ mask --
Clear the mask bits in the service's status flag cell.
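The three flag words above follow the usual mask pattern on a single cell. A minimal sketch in standard Forth, assuming a stand-alone variable in place of the service's status cell; fl-status, FL_ASP and the fl- words are hypothetical names, not PowerNet definitions:

```forth
\ Sketch only: fl-status stands in for the service's status flag cell.
variable fl-status   0 fl-status !

$00000100 constant FL_ASP        \ same bit value as WEBS_ASP above

: fl-set   ( mask -- )      fl-status @ or  fl-status ! ;
: fl-clr   ( mask -- )      invert fl-status @ and  fl-status ! ;
: fl-test  ( mask -- n|0 )  fl-status @ and ;

FL_ASP fl-set
FL_ASP fl-test .      \ non-zero: the bit is set
FL_ASP fl-clr
FL_ASP fl-test .      \ 0: the bit is clear
```

Because the test word returns the masked bits rather than a canonical true flag, the result can be used directly by IF but should not be compared with -1.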
: SetHttpTimeout \ time --
Set the timeout target for HTTP. A value of 0 indicates
no timing.
: GetHttpTimeout \ -- time
Get the timeout target for HTTP. A value of 0 indicates
no timing.
The HTTP server establishes its own generic I/O based on that in SERVERS.FTH in order to handle special character processing in the future.
create ConsoleHTTP \ -- addr ; OUT managed by upper driver
Function despatch table for HTTP I/O.
OUT is managed by the upper level driver.
: HTTPio \ --
Select HTTP as the console.
: Init-ConsoleHTTP \ --
Initialise for console I/O by HTTP. Note that the
HTTP must have been set up and the private
service area initialised.
For correct handling of error responses, some browsers require the HTTP "Content-Length" field to be defined. This means that the output length must be known before sending. Consequently, error messages are buffered before transmission.
N.B. All output to the buffer is unchecked for overflow. Checking is the responsibility of the application.
cell +user my_opbuff \ -- addr
Holds the current output buffer, the first cell holding
the buffer length.
: BuffEmit \ char --
Send a character to the buffer.
: BuffType \ caddr len --
Send a string to the buffer.
: BuffCr \ --
Send a CR/LF pair to the buffer.
create ConsoleBuff \ -- addr ; OUT managed by upper driver
Function despatch table for HTTP buffered I/O.
OUT is managed by the upper level driver.
: Buff$ \ -- addr len
Return the address and length of the HTTP output buffer.
: Init-ConsoleBuff \ len -- addr|0
Initialise for console I/O by the HTTP output buffer.
Len is the required size of the buffer and the address
of the buffer is returned for success, or zero is returned
if the buffer could not be allocated. The first cell of
the buffer contains the length used, the rest is for data.
: Term-ConsoleBuff \ --
Terminate buffer I/O by freeing the buffer.
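The buffer layout described above (first cell holds the length used, data follows) can be sketched in standard Forth as shown here. This is an illustration under stated assumptions, not the PowerNet source: a static buffer replaces the allocated one, and all ob- names are hypothetical.

```forth
\ Sketch only: a static buffer stands in for the one that
\ Init-ConsoleBuff allocates. All ob- names are hypothetical.
256 constant /ob-data
create ob-buff  0 ,  /ob-data allot      \ first cell = length used

: ob$      ( -- caddr len )  ob-buff cell+  ob-buff @ ;
: ob-emit  ( char -- )       ob$ + c!  1 ob-buff +! ;
: ob-type  ( caddr len -- )              \ unchecked for overflow
  dup >r  ob$ + swap move  r> ob-buff +! ;
: ob-cr    ( -- )  13 ob-emit  10 ob-emit ;

: ob-demo  ( -- )
  0 ob-buff !                            \ empty the buffer
  s" <html><body>404 Not found</body></html>" ob-type  ob-cr
  ." Content-Length: " ob-buff @ 0 .r cr \ length is known before sending
  cr  ob$ type ;
ob-demo
```

Like the words documented above, the sketch does not check for overflow, so the caller must size the buffer for the largest message it builds.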
1 value httpDiags? \ -- n
Set this non-zero to get diagnostic information.
: [hd httpDiags? if consoleio decimal ;
A COMPILER macro used to surround debug code, and
terminated by HD].
[HD ." debug message" HD]
: hd] HttpIo endif ;
Terminates a [HD ... HD] structure.
: .hdLine \ caddr len --
Display text with leading CR on Forth console. If
httpDiags? is set to zero, no action is taken.
PDATA_MAX equ WEB_SIZE \ -- n
Maximum web content sent in one packet.
: WebSend \ caddr len --
Send an arbitrary sized data block to the HTTP client.
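Sending a block larger than one packet amounts to a loop that transmits at most one packet's worth at a time. A minimal sketch in standard Forth, assuming a hypothetical limit CK_SIZE in place of WEB_SIZE and a stub ck-packet in place of the real transmit call:

```forth
\ Sketch only: CK_SIZE and ck-packet are hypothetical stand-ins.
512 constant CK_SIZE             \ assumed packet payload limit
variable #ck-packets  0 #ck-packets !

: ck-packet  ( caddr len -- )    \ stub: count packets instead of sending
  2drop  1 #ck-packets +! ;

: ck-send  ( caddr len -- )
  begin  dup 0> while
    2dup CK_SIZE min             \ caddr len caddr chunk
    dup >r  ck-packet            \ send one packet's worth
    r> /string                   \ step over what was sent
  repeat  2drop ;

pad 1200 ck-send   #ck-packets @ .   \ 3 packets: 512 + 512 + 176
```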
Common Gateway Interface (CGI) defines how data is passed between a browser (client) and a server. The equivalent of a variable is a name/value pair (e.g. submit=send). Such pairs are sent by the browser after filling in a form.
CGI variables are "URLencoded" as key/value pairs of the form:
key1=value1&key2=value2&...
When a form response is sent with a GET message, the CGI variables are sent after the URI separated from it by a '?' character. When a form response is sent with a POST message, the CGI variables appear in the body of the message.
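The key/value format can be illustrated by splitting a query string at '&' and then splitting each pair at '='. This is a sketch in standard Forth; split-at and .pairs are hypothetical helpers and are not part of PowerNet, which stores the pairs in QVARs rather than printing them.

```forth
\ Sketch only: split-at and .pairs are hypothetical helpers.
: split-at  ( caddr len char -- caddr1 len1 caddr2 len2 )
  \ caddr1/len1 is the text before char, caddr2/len2 the text after it.
  \ If char is absent, caddr2/len2 is empty.
  >r 2dup
  begin  dup if  over c@ r@ <>  else  0  then
  while  1 /string  repeat       \ caddr len caddr' len'
  r> drop
  2swap 2 pick -  2swap          \ caddr len1 caddr' len'
  dup if  1 /string  then ;      \ step over the separator

: .pairs  ( caddr len -- )
  begin  dup while
    [char] & split-at  2swap     \ rest pair
    [char] = split-at            \ rest key value
    2swap type  ."  -> "  type  cr
  repeat  2drop ;

: pairs-demo  s" sname=Robert&send=submit" .pairs ;
pairs-demo
```

The demonstration prints one line per pair, with the key and value separated by an arrow. No URL decoding is done at this stage.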
When a GET message is used, PowerNet receives the CGI
variables before it knows what to do with them. They are
saved in structures called QVARs, which hold counted
strings. Numeric values are converted from text by the
appropriate Forth words. Name comparison is case-insensitive.
PowerNet manages two sets of these variables. The first is a common set, which may be used to hold system data such as the host's IP address. The second set is allocated per connection as part of the service data, and only exists for the duration of the connection.
#20 equ /QvarName \ -- n
Length including count byte of a QVAR's name.
#20 equ /QvarData \ -- n
Length including count byte of a QVAR's data area.
struct /QVarRec \ -- n
Structure of a single QVAR.
#32 equ #QVARS \ -- n
Number of QVARs in each connection and the common area.
#QVARS /QVarRec * equ /QVARS \ -- n
Size of a QVAR buffer.
/QVARS buffer: commonQvars \ -- addr
The buffer area for the common QVARs.
$80000000 equ NO_VAR_SET \ -- n
Indicator returned when a QVAR has not been set.
/QvarName 1 chars - equ /QV.name \ -- n
Maximum length of a QVAR's name.
/QvarData 1 chars - equ /QV.data \ -- n
Maximum length of a QVAR's string data.
: (.Qvars) \ addr --
Display the variables in the given table.
: .Qvars \ --
Display the common and connection variables.
: (freeqvar) \ table -- addr|0
Find a free QVAR in the given table, and return its address
or zero if no free space is available.
: freeQvar \ -- addr|0
Find a free QVAR for the current connection, and return its address
or zero if no free space is available.
: freeCommonQvar \ -- addr|0
Find a free QVAR in the common QVARs, and return its address
or zero if no free space is available.
: (findQvar) \ caddr len table -- addr|0
Try to find a QVAR in the given table. Case insensitive.
: findQvar \ caddr len -- addr|0
Try to find the given QVAR name, returning the address
if found or zero if not found. Case insensitive.
: %xx>char \ caddr len -- caddr' len' char
Convert the three-character sequence "%xy" to a character, treating
"xy" as a two-digit hexadecimal number. Step over the sequence.
: decodeURL$ \ caddr len dest dlen --
Converts the source string caddr/len from a URL-encoded
string to a decoded counted string in the buffer dest/dlen.
No error checking is performed.
Decoding converts '+' characters to a space and
"%ab" ('%' followed by two hex digits) sequences to their
character codes.
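The decoding rules above can be sketched in standard Forth as follows. This is not the PowerNet source: hex-digit, next-decoded and url-decode are hypothetical names, the result is returned as a length rather than stored as a counted string, and the input is assumed to be well formed (every '%' is followed by two hex digits).

```forth
\ Sketch only: all names here are hypothetical.
: hex-digit  ( char -- n )       \ no validation of the character
  dup [char] 9 > if  $DF and  [char] A - 10 +  else  [char] 0 -  then ;

: next-decoded  ( caddr len -- caddr' len' char )
  over c@
  dup [char] + = if  drop  1 /string  bl  exit  then
  dup [char] % = if
    drop
    over 1+ c@ hex-digit 16 *    \ high nibble
    2 pick 2 + c@ hex-digit +    \ low nibble
    >r  3 /string  r>  exit
  then
  >r  1 /string  r> ;            \ ordinary character

: url-decode  ( caddr len dest -- len' )
  dup >r  rot rot                \ dest caddr len   R: dest start
  begin  dup 0> while
    next-decoded                 \ dest caddr' len' char
    3 pick c!                    \ store the decoded character
    rot 1+ rot rot               \ advance dest
  repeat
  2drop  r> - ;                  \ decoded length

create ud-dest 64 allot
: ud-demo  s" a+b%40c.org" ud-dest url-decode  ud-dest swap type ;
ud-demo                          \ prints a b@c.org
```

As with the word documented above, the destination is not checked for overflow, which is safe here only because decoded text is never longer than its source.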
: setQvarData \ caddr len qvar --
Use the given URL encoded string caddr/len to set the
data area of the given qvar.
: setQstring \ name nlen string slen --
Set the connection QVAR name/nlen to contain
string/slen. If the name already exists in the
connection or common QVARs it is overwritten. If the
QVAR does not exist it is created in the connection's
QVAR set. If there is no space for it, the request is
ignored.
: setCommonQstring \ name nlen string slen --
Set the common QVAR name/nlen to contain string/slen.
If the QVAR does not exist it is created. If there is no
space for it, the request is ignored.
: GetQstring \ name nlen -- caddr len
Return the text for a string. If the variable cannot be found,
"???" is returned.
: .Qstring \ name nlen --
Output the text for a QVAR using GetQstring above.
: websetvars \ caddr len --
Process a query string. A query string has the form:
name=value&name=value...
WEBSETVARS can be used with any query string.
: WebQueryVars \ caddr len --
This is used by GET with query packets.
: init-CommonQvars \ --
Initialise the common QVARs.
: init-WebVars \ --
Initialise the connection QVARs.
: setQvar \ caddr len n --
Set the given QVAR to n, which is held as a
signed decimal string.
: getQvar \ caddr len -- n
Return the value held in the QVAR as a signed decimal
number. If the QVAR does not exist NO_VAR_SET is
returned. If the string cannot be converted, zero is returned.
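Holding a number as a signed decimal string needs a conversion in each direction. A minimal sketch in standard Forth, assuming hypothetical words n>dec$ and dec$>n; like getQvar, the text-to-number word returns zero when the text cannot be converted:

```forth
\ Sketch only: n>dec$ and dec$>n are hypothetical names.
: n>dec$  ( n -- caddr len )     \ text lives in the pictured-number buffer
  base @ >r  decimal
  dup abs 0 <# #s rot sign #>
  r> base ! ;

: dec$>n  ( caddr len -- n )     \ 0 if the text is not a decimal number
  dup 0= if  2drop 0 exit  then
  over c@ [char] - =  dup >r     \ R: negative?
  if  1 /string  then
  dup 0= if  2drop  r> drop  0 exit  then
  base @ >r  decimal
  0 0 2swap >number              \ ud caddr' len'
  r> base !
  nip nip                        \ low-cell len'
  if  drop 0  r> drop  exit  then   \ unconverted characters remain
  r> if  negate  then ;

-42 n>dec$ type                  \ prints -42
```

Both words force decimal conversion and restore BASE afterwards, so they give the same result whatever number base the calling script has selected.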
CEM specific numeric items are commented out.
: QvarString? \ caddr len -- type
Return true if the variable is a string variable.
ASP stands for "Active Server Pages". In PowerNet, these are HTML pages which are modified by PowerNet when served. See Examples\PowerNet\TestPages\thanks.asp for an example.
The script language is Forth itself. Inside an HTML document, scripting commands (Forth source) are contained inside tags of the form:
<% Forth_code %>
It is important that the script delimiters <% and %> are
surrounded by white space, otherwise the very simple parser
will fail. Before any scripting can be performed, the first
command must be on one line:
<% language=forthscript %>
After that, Forth source code can be interpreted. Note that any CGI variables (QVARS above) are available. For example, if a form was submitted with a GET request:
GET form1.asp?sname=Robert&send=submit
You can display the data using a script such as:
<% s" sname" .qstring %>
Pages can be served from a linear memory image, or from the FAT file system. When scripts are served from memory, each script section must be on a single line. When serving scripts from files, the script section can extend over several lines.
: script_code \ -- caddr len
Returns the command to select Forth as the scripting language.
: asp_header$ \ -- caddr len
ASP command header string.
: asp_tail$ \ -- caddr len
ASP command tail string.
: ScriptEngine \ caddr len --
Processes the string as Forth source. The data stack is
checked on return to ensure system integrity.
: AspProcess \ caddr len --
If Forth has been selected as the script language,
pass the string to SCRIPTENGINE, otherwise try to
find the Forth script command. In practice, this
means that the language selection command must be
on its own as the first script section.
: AspRequest \ caddr len --
Extract script section and try to process it. All output is
done using TYPE and WebSend.
HTTP headers are processed by building a list of actions and strings. The action is an xt and the string is the header text that we are interested in. Each action has the stack effect
caddr len --
where caddr/len is the string after the header and the trailing colon.
create UploadHdrs \ -- addr
' doCLength , ," Content-Length" align
' doCType1 , ," Content-Type" align
0 , 0 ,
: CheckName \ src slen name nlen -- flag
Return true if the start of the string src/slen contains
the entire name name/nlen.
: CheckHeader \ src slen name nlen -- flag
Return true if the start of the string src/slen contains
the entire name name/nlen and a trailing colon.
: doHeader \ caddr len list --
Given a string, check it against the given list of headers.
: ProcessHeaders \ list --
We have already received the GET/POST line. Process each header
line and finish at the first blank line, ready for the message
body. This can be used for the header blocks in forms
as well as for the first header block.
: doCLength \ caddr len --
Process the string to extract the content length data,
a decimal number. A valid result is saved in the HTTP
service data's httpClength field.
variable FormType \ -- addr
Holds the status extracted from the first Content-Type
header. This is set true if the header includes
"multipart/form-data".
#80 buffer: Boundary$ \ -- addr
Holds the boundary string.
: nextChar \ caddr len -- caddr' len' char
Get the next character from the string and step on.
: ?MultiPart \ caddr len --
Check for the multipart/form-data field.
: ExtractValue \ caddr len dest --
Given a string starting after the name portion of a name/value
pair, e.g. name="text", save the value text without
any quote marks at dest. If there is no value text,
the destination is left unchanged.
: ?Boundary \ caddr len --
Check for the boundary=xxx field.
: doCType1 \ caddr len --
Process the string to extract the content type data,
which is the multipart form item and the boundary string.
create BaseHdrs \ -- addr
Contains a list of the basic header fields to process.
BaseHdrs value DefHdrs \ -- addr
Holds the default header list for file and memory pages.
: ReadHeaders \ --
Process the headers after the HTTP command line up to the
first blank line. Do not use this word for reading part
headers. If the response headers have already been read,
this word is a no-op.
Words are provided for use with form results submitted as a POST request to an ASP page. You can use these with simple URL encoded or multipart forms. An example of multipart form handling is the file upload application in Examples\WebPost.fth. This example includes boundary handling and parsing code.
Correct operation of this code requires a Content-Length
header in the request. In many cases, the body of a POST
request is a very long line. Because this server is designed
for systems with limited RAM, key/value pairs are not
read as a single line, but are read and parsed character by
character. Keys are limited to 31 characters, and values
to /SVtib - 32 characters. The service's TIB is where they
are buffered.
: BodyLeft \ -- addr
Returns the address holding the remaining size of the HTTP
body. If no Content-Length field has been found, ReadHeaders
will have set this to -1.
: -BodyLeft \ n --
Reduce the remaining content by n.
: CGIname$ \ -- caddr
Buffer for name assembly.
: CGIval$ \ -- caddr
Buffer for data assembly.
: CGIend? \ -- flag
Return true if the body content is exhausted.
: CGIkey \ -- char
Read the next character. If the content is exhausted or there
has been a socket error, LF is returned.
: ReadKey \ -- ior
Read the key portion of a key/value pair, returning zero
for success.
: ReadValue \ -- ior
Read the value portion of a key/value pair, returning zero
for success.
These words will mostly be used inside ASP scripts.
Note that the key and value strings are not URL decoded.
You can use decodeURL$ to perform the decode and
copy in one operation.
It is assumed that you control both the RAM usage of the PowerNet server and the scripts it serves. We are not trying to reimplement the Apache server!
To read the pairs, call ReadNextPair and inspect
the return result. If it is good, examine and process
the PairName and PairValue strings. Because
you control the scripts, it is sensible to use key names
that do not require URL decoding. Especially for user input,
some key values may need decoding, e.g. email addresses.
This decision is left up to you.
: ReadNextPair \ -- ior
Read the next key/value pair and return 0 on success.
-1 is returned for an unexpected end of input or socket
error, and -2 is returned if input was truncated before
the key terminator.
: PairName \ -- caddr len
Return the last key/value pair's key text.
: PairValue \ -- caddr len
Return the last key/value pair's value text.
: DumpPairs \ --
A diagnostic tool for use in scripts. Read and dump all
the key/value pairs without preserving them.
This section deals with extracting the HTTP command data and with pages served from memory.
: web-file-name \ caddr len -- caddr len'
Return the string up to the next '?' character. Then, if
the file name starts "HTTP://.../" step over "HTTP://...".
This situation can occur with proxy servers.
: test-web-type \ caddr len -- type
Return the type code associated with the file extension,
e.g. ".htm" or ".asp".
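Mapping a file name to a type code can be sketched as finding the text after the last '.' and comparing it with known extensions. The sketch below is in standard Forth under stated assumptions: ext$ and ext>type are hypothetical names, only three extensions are handled, the comparison is case-sensitive, and the numeric results are the PLAIN_TYPE, GIF_TYPE, HTML_TYPE and ASP_TYPE values listed earlier.

```forth
\ Sketch only: ext$ and ext>type are hypothetical names.
: ext$  ( caddr len -- caddr' len' )   \ text after the last '.', or empty
  dup >r                               \ R: original length
  begin  dup while
    2dup + 1- c@ [char] . = if
      tuck + swap  r> swap -  exit     \ extension address and length
    then
    1-
  repeat
  r> drop ;                            \ no '.' found: empty string

: ext>type  ( caddr len -- type )
  ext$
  2dup s" htm" compare 0= if  2drop 2 exit  then   \ HTML_TYPE
  2dup s" gif" compare 0= if  2drop 1 exit  then   \ GIF_TYPE
  2dup s" asp" compare 0= if  2drop 3 exit  then   \ ASP_TYPE
  2drop 0 ;                                        \ PLAIN_TYPE

: ext-demo  s" /home.htm" ext>type . ;
ext-demo                               \ prints 2
```

A file name with no extension, or with one not in the list, falls through to the plain type in this sketch.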
: .HTTP# \ n --
Output the HTTP string and code line.
: .HTTPserver \ --
Output the HTTP server name line.
: .HTTPdate \ --
Output an HTTP date line.
: .HTTPar-none \ --
Output the "accept-ranges: none" line.
: .HTTPcontent \ type --
Output the text string for the current type. Unknown types
generate "text/plain".
: .close/keep \ --
Output the default connection type.
: WebResponse \ datalen type code --
Output a suitable HTTP header for this type of data.
create Null$ \ -- addr
A null string which may be used as a counted
string or a zero-terminated string.
: .ErrHead \ len err# --
Output the error header. If len is non-zero, it is
used for a "Content-Length" field.
: .ErrBody \ caddr1 len1 caddr2 len2 err# --
Output a web error body using the given strings
and error number. The string caddr2/len2 contains the
error description, e.g. "Page not found". If len2 is zero
no error text body is sent and caddr1/len1 is discarded.
The string caddr1/len1 is displayed if len1 is non-zero,
and is usually the resource name that caused the problem.
: weberror \ caddr1 len1 caddr2 len2 err# --
Display a web error message using the given strings
and error number. The string caddr2/len2 contains the
error description, e.g. "Page not found". If len2 is zero
no error text body is sent and caddr1/len1 is discarded.
The string caddr1/len1 is displayed if len1 is non-zero,
and is usually the resource name that caused the problem.
create HomePage$ ", /home.htm"
If not already defined, HomePage$ is set to contain
the counted string /home.htm.
: CheckHomePage \ caddr len -- caddr' len'
Check for a home page string of the forms "/ " or "/" and
if found replace with "/home.htm".
: ServeSMem \ caddr len type --
Serve a page from memory, according to its page type
extracted from the page name.
#256 equ /FileUnit \ -- len
The size of a part of a file processed at once.
: FileQuery \ -- ; fetch line into TIB
Reset the input source specification to the console and
accept a line of text into the input buffer.
: AspInterpret \ --
Process the current input line as if it is text entered at
the keyboard.
: InterpScript \ --
Interpret a section of ForthScript which may extend over
several lines.
: sendPrev \ addr --
Send the source line before this address.
semaphore InterpSem \ -- addr
Exclusive access semaphore for the Forth interpreter.
: FileScript \ --
Read and process a file with server side scripting until EOF
or an error. The file is already open and is not closed.
: AspFileReq \ len --
Read and process a file of length len with server side
scripting. The file is already open. All input is done using
the FileCon Generic I/O device (see FATcore.fth).
All output is done using TYPE and WebSend.
: badFS \ x y z --
Discard three items and set error flags.
: (FileSend) \ len buff --
Serve the file using the given buffer.
: FileSend \ len --
Serve a plain file without scripting. The file is open and
the details are in the given /PageDef structure.
: ServeFile \ len type --
Serve a page from a file, according to its page type
extracted from the page name. The request headers are read
if they have not already been read.
: ?CloseHTTP \ --
Run after processing input to see if the connection
should be closed.
: Serve404 \ qaddr qlen --
Serve a "not found" page.
: ServePage \ qaddr qlen --
Serve a page defined by the string qaddr/qlen which
may include CGI vars. The string is the command line with
the command removed.
: http-cmd \ caddr len -- caddr' len'
Deal with the command line after the command has been
recognised. The input is the complete line. The output
is the line without the command.
: http-get \ caddr len --
Process a GET command. The input string is the
complete input line containing the GET command.
semaphore PostSem \ -- addr
Controls access to the single POST handler. POST requests
are serialised because they may change the state of the
system.
: http-post \ caddr len --
Process a POST command. The input string is the
complete input line containing the POST command.
: ParseHTTP \ caddr len --
Process received HTTP command line. At present we only deal
with GET and POST commands.
: doHTTPinput \ --
Process any pending input.
: cleanHTTP \ *sv -- *sv
Cleans up the HTTP system when a task is shut down from the
kill chain.
: HTTPService \ --
The HTTP service action or task launched for each established
HTTP connection.
Listening tasks or actions are spawned when the HTTP server gets a connection.
: HTTPServer \ -- ; stay here forever
The HTTP listening task.
0 value HTTPtask \ -- 0|task
Returns 0 or the HTTP server task if running.
: RunHTTPtask \ --
Start the HTTP server task.
: StopHTTPtask \ --
Stop the HTTP server.
The majority of page requests are made using a GET request. For these, only the first line of the header needs to be scanned and the request is contained in a single packet whose maximum size is defined in lower layers of the Powernet system. Consequently a fixed size packet buffer is used for HTTP input.
When handling POST requests, e.g. for Web Services, the input data is not in the headers, but is contained in the body of the message. The size of this body is defined by the Content-Length header. When handling POST messages, the whole of the incoming message must be read, the body extracted and passed to the message handler. In addition, some handlers may need to process the message header. For example, Web Services need to use the SOAPaction header before SOAP version 1.2.
Scripting for output messages introduces the problem that the size of the message body is not (in general) known until the script output has been generated. This means that the Content-Length header (sent before the body) cannot be formed until the body has been generated.
To avoid the memory overhead of buffering script output, for ASP file requests no Content-Length header is generated and the connection is closed after the response has been sent to indicate that the message is complete. For web services, ASPX pages are served and are buffered, and a Content-Length header is generated.
Error messages such as "404 Not found" are always buffered to produce a valid Content-Length header because some browsers require this header for error messages.
The Content-Length header is not always required for HTTP version 1.1 and above. However, if HTTP 1.0 clients have to be supported, the Content-Length header must be provided. In this case all scripted operations must be buffered. See RFC2616 for more details.
Before PowerNet v4.4, authentication was provided by a deferred word. Nobody reported using it, so it has been removed to save memory. This section shows how to put it back if required. All the code to be added can be found towards the end of HTTP.FTH.
More flexibility is provided in v4.4 because it is much easier to parse headers (see Examples\WebPost.fth), and scripting provides more choice as to which files are secured and which are public.
Add the line below to the /HTTPdata structure.
int httpLogin \ login value
Add the following definitions.
: WebLogin \ -- addr
Return the address of the service's login result.
defer WebAuthenticate \ caddr len -- res|0
Given a GET string, returns a non-zero code for permission
to carry on. If permission is refused, zero is returned.
: (WebAuthenticate) \ caddr len -- res|0
The default action of WebAuthenticate always returns true.
Restore the following code to HTTP-CMD.
2dup WebAuthenticate ?dup 0= if
  2drop Null$ 0 s" Invalid login " #401 weberror exit
endif
WebLogin ! \ stash good login result
create BaseHdrs \ -- addr
Contains a list of the basic header fields to process.