Base Kernel Definitions

This section describes a number of the base kernel definitions available to the system. This wordset includes the vast bulk of the ANS Forth specified words as well as a number of useful additions. Note that further information about some words may be found in the draft ANS specification, accessible from the Help menu.

Glossary Notation

The notation for the glossary definitions found in this manual has two major parts: a definition line, followed by a description of the word.

The definition line varies depending on the definition type. For instance, a normal Forth word looks like:


: AND              \ n1 n2 -- n3                           6.1.0720

where the leftmost column gives the type of the definition (colon) and its name (AND), the center column describes the stack effect of the word, and the far right column gives the ANS standard's reference ID, an MPE reference ID, or Forth200x to indicate that the word is a standards proposal. This field may also be empty.

Main Vocabularies

vocabulary FORTH        \ --
The standard general purpose vocabulary.

vocabulary ROOT         \ --
This vocabulary contains only the words which ensure that you can select other vocabularies.

vocabulary SYSTEM       \ --
A repository for those words which are required internally by the compiler/system but should never appear in user code. SYSTEM words may be changed without notice.

vocabulary ENVIRONMENT  \ --
Storage for ANS environment query items.

vocabulary SourceFiles  \ --
Storage for SourceFile descriptions after INCLUDE.

vocabulary substitutions        \ --
Repository for text macros.

vocabulary Externals    \ --
Repository for external library calls.

ASCII Character Constants

Various constants for ASCII characters to aid readability and to provide some insulation between VFX Forth implementations on different operating systems.

$07 constant ABELL      \ -- char
Bell/sound character

$08 constant BSIN       \ -- char
Backspace on input character

$7F constant DELIN      \  -- char
Delete character

$08 constant BSOUT      \ -- char
Backspace on output character

$09 constant ATAB       \ -- char
Tab character

$0D constant ACR        \ -- char
Carriage Return character

$0A constant ALF        \ -- char
Line Feed character

$0C constant FFEED      \ -- char
Form Feed character

$20 constant ABL        \ -- char
Space character

$2E constant ADOT       \ -- char
Dot character

$00 constant AEOL       \ -- char
Generic EOL marker.

#13 constant ANL        \ -- char
Host specific constant for the character returned when you press the Enter key on your keyboard.

create eol$     \ -- addr
A counted and zero terminated string holding the operating system specific end of line sequence.

create crlf$    \ -- addr
A counted and zero terminated string holding a CR/LF pair.

System CONSTANTs

Various constants for the internal system.

0 constant FALSE               \ -- 0                          6.2.1485
The well formed flag version for a logical negative.

-1 constant TRUE                \ -- -1                         6.2.2298
The well formed flag version for a logical positive.

ABL constant BL                 \ -- u                          6.1.0770
An internal constant for blank space.

$40 constant C/L                \ -- u
Max chars/line for internal displays under C/LINE.

64 constant #VOCS               \ -- u
Maximum number of Vocabularies in search order.

#VOCS cells constant VSIZE      \ -- u
Size of CONTEXT area for search order.

$200 constant FILETIBSZ         \ -- len
Size of TIB buffer when SOURCE-ID is a file pointer.

#260 constant MAX_PATH          \ -- len
Size of longest file/path name for Windows and DOS. 1024 is used for Linux and OS X.

$00 constant NULL
NULL pointer.
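
Because TRUE is all bits set (-1) and FALSE is all bits clear (0), a well formed flag can be used directly as a bit mask. The following is a minimal sketch in standard Forth; CLAMP0 is a hypothetical name chosen for illustration and is not a kernel word.

```forth
\ CLAMP0 is a hypothetical helper, not part of the kernel.
: clamp0        \ n -- n|0
  dup 0>        \ 0> yields TRUE (-1, all bits set) or FALSE (0)
  and           \ mask n with the flag: n if positive, else 0
;

5 clamp0 .      \ displays 5
-5 clamp0 .     \ displays 0
```

This only works with well formed flags. A non-zero value that is merely "true" in the IF sense would corrupt the masked result.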

Defined USER Variables

USER variables are the Forth equivalent of Thread Local Storage. They are for task specific information and act as normal variables within their thread scope.

USER variables can be defined by the words USER and +USER. They are defined using an offset from a base address assigned at the start of each task. Offsets in the USER area below $1000 are reserved for kernel use. The variable NEXTUSER is used by +USER and is initialised to $1000 in the primary build of VFX Forth, with 4k bytes of memory available for application use.

The following USER variables have been declared within the system.

$00 cells user S0               \ -- addr
Initial Base of data stack.

$01 cells user R0               \ -- addr
Initial Base of return stack.

$02 cells user #TIB             \ -- addr ; 6.2.0060
Number of characters currently in TIB.

$04 cells user >IN              \ -- addr ; 6.1.0560
Pointer to next char in input stream.

$05 cells user OUT              \ -- addr
Number of characters output since last CR.

$06 cells user BASE             \ -- addr                               6.1.0750
Numeric Conversion Base.

$07 cells user HLD              \ -- addr
Used during number formatting to point to next character to save.

$08 cells user #L               \ -- addr
Number of cells converted by NUMBER? and friends.

$09 cells user #D               \ -- addr
Number of digits converted by NUMBER? and friends.

$0A cells user DPL              \ -- addr
Position of double number indicator in number text.

$0B cells user 'TIB             \ -- addr
Address of TIB.

$0E cells user OP-HANDLE        \ -- addr
Generic IO output handler structure.

$0F cells user IP-HANDLE        \ -- addr
Generic IO input handler structure.

$10 cells user CURROBJ  \ -- addr
Current Object Pointer for OOP extensions.

$11 cells user 'AbortText       \ -- addr
Pointer to counted string for last ABORT".

$12 cells user $S0              \ -- addr
Initial Base of string stack.

$13 cells user $SP              \ -- addr
Current string stack pointer.

$14 cells user fs0
Initial Base of float stack.

$15 cells user fsp
Current float stack pointer.

$16 cells user line#                    \ SFP001
Current source input line number. Note that this variable does NOT describe the number of lines output, but is reserved to hold the number of lines read from the current source input device. For console devices, LINE# should be set to -1 to indicate that the source cannot be recovered for words such as LOCATE and XREF.

$17 cells user op-line#         \ -- addr
Current output line number, incremented by CR and reset by QUIT.

$18 cells USER ThreadExit?      \ -- addr
Used in the multitasker to indicate that the task/thread should terminate.

$19 cells USER ThreadTCB        \ -- addr
Holds the address of a task/thread's Task Control Block.

$1A cells USER ThreadSync       \ -- addr
Used in a task/thread for synchronisation.

user PAD                \ -- addr ;                             6.2.2000
Transient data area. The size of PAD is given by the constant /PAD in the ENVIRONMENT vocabulary.
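
Since BASE is a USER variable, each task has its own number base, and a word that changes it temporarily should restore it afterwards. The following sketch uses only standard words; .HEX is a hypothetical name used for illustration.

```forth
\ .HEX is a hypothetical helper, not a kernel word.
: .hex          \ u --
  base @ >r     \ save this task's current number base
  hex u.        \ display u as unsigned hexadecimal
  r> base !     \ restore the saved base
;

decimal 255 .hex        \ displays FF
```

Keeping the saved base on the return stack means the word has no effect on other tasks and leaves the caller's base unchanged.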

System Variables and Buffers

Variables

variable c/Line         \ -- addr
Maximum number of chars/line in interpret console.

variable c/Cols         \ -- addr
Character height in DEFAULT-CONSOLE device.

variable dp-char        \ -- addr
Holds up to four ASCII values of double number separators. Unused bytes must be set to zero.

variable fp-char        \ -- addr
Holds up to four ASCII values of floating point number separators.

variable ign-char       \ -- addr
Holds ASCII values of characters that are ignored during number scanning. Set to ':' by default.

variable dir1-char      \ -- addr
CELL, holds the primary directory separator character used when scanning file names. Set to '\' by default for Windows/DOS and to '/' for Unix derivatives.

variable dir2-char      \ -- addr
CELL, holds the secondary directory separator character used when scanning file names. Set to '/' by default for Windows/DOS and to '\' for Unix derivatives.

variable FENCE          \ -- addr
End of protected dictionary.

variable VOC-LINK       \ -- addr
Links vocabularies.

variable wid-link       \ -- addr
Links word-lists.

variable res-link       \ -- addr
Links resources.

variable lib-link       \ -- addr
Links dynamic/shared libraries.

variable ovl-link       \ -- addr
Links active overlays.

variable ovl-id         \ -- addr
Holds unique overlay ID.

variable <id>           \ -- addr
A variable that holds the next available ID number. See NEXTID: in the Resources Section.

variable import-func-link
Links imported API functions in shared libraries.

variable SCR            \ -- addr
For mass storage by old-timers.

variable BLK            \ -- addr
User input device: 0 for keyboard/file, non-zero is block number.

variable STATE          \ -- addr
Interpreting (0) or compiling (non-zero).

variable CSP
Stack pointer saved for error checking.

variable CURRENT        \ -- addr
Holds the wordlist/vocabulary in which new definitions are created.

vsize buffer: CONTEXT   \ -- addr
Search order array.

vsize buffer: MinContext        \ -- addr
A CONTEXT array for minimum search order.

variable LAST           \ -- addr
Points to last definition (after Link Field).

variable #THREADS       \ -- addr
Default number of threads in a new wordlist.

variable CHECKING       \ -- addr
True if checking structure definitions is enabled. Note that this variable may be removed in a future release.

2variable SOURCE-LINE-POS       \ -- addr
Contains double file position before refill.

variable Saved>IN       \ -- addr
Holds the value of >IN before each token parse in interpret.

variable <HeaderLess>   \ -- addr
A flag. Declares the presence of a header in the last definition.

variable 'SourceFile    \ -- addr
Pointer to source include struct for current file, or 0.

variable tabwordstop    \ -- addr
Cursor X Position for tab stops.

variable Optimising     \ -- addr
Variable is set TRUE when optimisation should be used.

variable NextUser
Next Valid offset for a new user variable.

variable OPERATORTYPE   \ -- addr
Set by prefix operators such as TO and ADDR.

variable Top-Mask       \ -- addr ; controls loop alignment
Mask that controls the alignment of loop heads during code generation.

variable TextChain      \ -- addr
Anchor for the linked list of error message structures.

variable debug1         \ -- addr
When set, INCLUDE displays the lines of the file.

Values

0 value FpSystem        \ -- n
The value FPSYSTEM defines which floating point pack is installed and active for compilation. See the Floating Point chapter for further details.

0 value original-xt     \ -- xt
Set during a redefinition to preserve the xt of the word being redefined.

Kernel DEFERred words

These words are DEFERred to allow later modification.
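
As a minimal sketch of what deferring allows, the fragment below creates a deferred word and changes its action without recompiling its callers. GREET, HELLO and HOWDY are hypothetical names, and the fragment assumes the Forth200x words DEFER and IS are available.

```forth
defer greet                 \ hypothetical deferred word
: hello  ." Hello" ;        \ first candidate action
: howdy  ." Howdy" ;        \ second candidate action

' hello is greet            \ assign HELLO as the action of GREET
greet                       \ displays Hello
' howdy is greet            \ reassign; existing callers of GREET are unaffected
greet                       \ displays Howdy
```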

Input and Output

Although the standard Forth I/O functions are deferred, users are strongly encouraged to use the generic I/O mechanism rather than to change the global effect of the I/O words. The I/O words are DEFERred for historical reasons and to ease porting.

defer EMIT      \ char -- ; display char
Display char on the current I/O device.

defer EMIT?     \ -- ior
Return a non-zero ior if the current output device is ready to receive a character. The ior may be device dependent.

defer KEY       \ -- char ; receive char
Wait until the current input device receives a character and return it.

defer KEY?      \ -- flag ; check receive char
Return true if a character is available at the current input device.

defer EKEY      \ -- char ; receive char
Wait until the current input device receives a character and return it. Note that the behaviour of EKEY and EKEY? may be implementation dependent. See the ANS Forth standard for more details.

defer EKEY?      \ -- flag ; check receive char
Return true if a character is available at the current input device. Note that the behaviour of EKEY and EKEY? may be implementation dependent. See the ANS Forth standard for more details.

defer CR        \ -- ; display new line
Perform the equivalent of a CR/LF pair on the current output device. This action may be device dependent.

defer TYPE      \ c-addr len -- ; display string
Display/write the string on the current output device.

defer ACCEPT    \ c-addr +n1 -- +n2
Read a string of maximum size n1 characters to the buffer at c-addr, returning n2 the number of characters actually read. Input may be terminated by a CR. The action may be input device specific.

Kernel and Convenience

These words are deferred to improve kernel portability, and to provide points at which the default behaviour of the Forth kernel can be changed.

defer EntryPoint        \ hmodule 0 commandline show -- res
This word is the entry point from the startup code to the Forth system. The arguments follow the WinMain conventions, except that the command line may include the program name. See the chapter about creating turnkey applications for more and important details.

defer ABORT     \ i*x -- ; R: j*x -- ; error handler
Empty the data stack and perform the action of QUIT, which includes emptying the return stack, without displaying a message.

defer isNumber? \ caddr len -- d 2 | n 1 | 0
Attempt to convert the string caddr/len to an integer. The result is zero if conversion failed, a single-cell number and one for a single-cell conversion, or a double-cell number and two for a double-number conversion. The ASCII number string can also contain an explicit radix (number base) override: a leading $ enforces hexadecimal, a leading # enforces decimal and a leading % enforces binary. Hexadecimal numbers can also be specified by a leading '0x' or a trailing 'h'. After a floating point pack has been compiled from the Lib directory, the action of NUMBER? is changed to accept floating point numbers as well as integers.

: NUMBER?   \ addr -- d 2 | n 1 | 0
As isNumber? but takes a counted string.

defer ShowSourceOnErrorHook     \ --
Performed at the end of SHOWSOURCEONERROR.

defer EditOnError       \ --
Performed in DOERRORMESSAGE. The default is NOOP. This word is assigned a new action by the Studio environment.

defer pause     \ --
The multitasker is installed here. Until a multitasker is installed the action is NOOP or YIELD. Do not call PAUSE inside callbacks.

defer ms        \ n --
Wait for n milliseconds.

defer ticks     \ -- n
Return the system timer value in milliseconds. Treat the returned value as a 32 bit unsigned number that wraps on overflow.

defer interpret \ i*x -- j*x ; process current input line
Process the current input line as if text entered at the keyboard.

defer QUIT              \ -- ; R: i*x --                        6.1.2050
Empty the return stack, store 0 in SOURCE-ID, make the console the current input device, and enter interpretation state. QUIT repeatedly ACCEPTs a line of input and INTERPRETs it, with a prompt if in interpretation state. See the separate chapters on error handling and internationalisation for details of error message display.

defer .Prompt   \ --
The Forth console prompt.

GUI interface hooks

These words provide hooks into systems, both GUI and kernel, which use message passing or event handlers. These words are mostly used by Generic I/O devices while waiting.

defer Idle              \ --
Windows only: Despatches the next message, waiting if none are present. Idle only returns when a message has been received.

defer WaitIdle          \ --
Linux, OS X and DOS: Despatches the next message/event, waiting if none are present. WaitIdle only returns when a message has been received.

defer BusyIdle          \ --
Despatches one message/event if available. The word returns immediately if no messages are available. The default action is (BusyIdle). See also EmptyIdle.

defer EmptyIdle \ --
Empty the message/event loop, returning when no messages are available. EmptyIdle can be used in applications to ensure that the GUI system has an opportunity to process messages/events.

Logic functions

Perform various logic and bit based operations on stack items.

: and           \ n1 n2 -- n3                                   6.1.0720
Perform a logical AND between the top two stack items and retain the result in top of stack.

: or            \ n1 n2 -- n3                                   6.1.1980
Perform a logical OR between the top two stack items and retain the result in top of stack.

: xor           \ n1 n2 -- n3                                   6.1.2490
Perform a logical XOR between the top two stack items and retain the result in top of stack.

: invert        \ n1 -- ~n1                                     6.1.1720
Perform a bitwise inversion.

: not           \ n1 -- ~n1
Perform a bitwise NOT on the top stack item and retain result. OBSOLETE but retained because of widespread use. In VFX, not is the same as invert, but in other Forth systems not may be the same as 0=.

: and!          \ x addr --
Logical AND x into the cell at addr.

: or!           \ x addr --
Logical OR x into the cell at addr.

: xor!           \ x addr --
Logical XOR x into the cell at addr.

: bic!           \ x addr --
Invert x and logical AND it into the cell at addr. The effect is to clear the bits at addr that are set in x.

: false=        \ n1 -- flag
Perform a logical NOT on the top stack item.
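
The memory operators above are read-modify-write operations on a cell. The sketch below expresses them with standard words to show the intended effect; the MY- names and DEMO-FLAGS are hypothetical, and the kernel versions may be implemented differently.

```forth
variable demo-flags   0 demo-flags !    \ hypothetical example variable

\ Standard-word equivalents, for illustration only.
: my-or!   ( x addr -- )  tuck @ or  swap ! ;   \ set the bits of x
: my-and!  ( x addr -- )  tuck @ and swap ! ;   \ keep only the bits of x
: my-bic!  ( x addr -- )  swap invert swap my-and! ;  \ clear the bits of x

$0F demo-flags my-or!     \ cell is now $0F
$03 demo-flags my-bic!    \ cell is now $0C
$08 demo-flags my-and!    \ cell is now $08
decimal demo-flags @ .    \ displays 8
```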

Stack manipulations

The following words manipulate items on the data and return stacks.

: NOOP          \ --
A NOOP, null instruction.

: NIP           \ x1 x2 -- x2                                   6.2.1930
Dispose of the second item on the data stack.

: TUCK          \ x1 x2 -- x2 x1 x2                             6.2.2300
Insert a copy of the top data stack item underneath the current second item.

: PICK          \ xu .. x0 u -- xu .. x0 xu                     6.2.2030
Get a copy of the Nth data stack item and place on top of stack. 0 PICK is equivalent to DUP.

: RPICK         \ n -- a
Get a copy of the Nth return stack item and place on top of stack.

: ROLL          \ nn..n0 n -- nn-1..n0 nn                       6.2.2150
Rotate the order of the top N stack items by one place such that the current top of stack becomes the second item and the Nth item becomes TOS. See also ROT.

: nDrop         \ XN..X1 N -- xn..x2
Drop N items from the data stack.

: ROT           \ n1 n2 n3 -- n2 n3 n1                          6.1.2160
ROTate the positions of the top three stack items such that the current top of stack becomes the second item. See also ROLL.

: -ROT          \ n1 n2 n3 -- n3 n1 n2
The inverse of ROT.
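
The relationships between these words can be checked at the console. In the sketch below the stack comment on each line shows what that line leaves on the data stack; results are displayed with . so that the stack is empty again afterwards.

```forth
1 2 3 rot     . . .     \ -- 2 3 1 ; displays 1 3 2
1 2 3 2 roll  . . .     \ -- 2 3 1 ; 2 ROLL is the same as ROT
1 2 3 -rot    . . .     \ -- 3 1 2 ; displays 2 1 3
1 2 1 pick    . . .     \ -- 1 2 1 ; 1 PICK is the same as OVER
1 2 tuck      . . .     \ -- 2 1 2 ; displays 2 1 2
1 2 nip       .         \ -- 2     ; displays 2
```

Note that . displays the top item first, so the displayed order is the reverse of the stack comment.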

: >R            \ x -- ; R: -- x                                6.1.0580
Push the current top item of the data stack onto the top of the return stack.

: R>            \ -- x ; R: x --                                6.1.2060
Pop the top item off the return stack and place on the data stack.

: R@            \ -- x                                          6.1.2070
Copy the top item of the return stack and place on the data stack.

: 2>R           \ x1 x2 -- ; R: -- x1 x2                        6.2.0340
Transfer the two top data stack items to the return stack.

: 2R>           \ -- x1 x2 ; R: x1 x2 --                        6.2.0410
Transfer the top two return stack items to the data stack.

: 2R@           \ -- x1 x2 ; R: x1 x2 -- x1 x2                  6.2.0415
Copy the top two return stack items to the data stack.

: N>R           \ xn .. x1 N -- ; R: -- x1 .. xn n
Transfer N items and count to the return stack.

: NR>           \ -- xn .. x1 N ; R: x1 .. xn N --
Pull N items and count off the return stack.

: DROP          \ x --                                          6.1.1260
Lose the top data stack item and promote NOS to TOS.

: 2DROP         \ x1 x2 --                                      6.1.0370
Discard the top two data stack items.

: 3drop         \ x1 x2 x3 --
Discard the top three data stack items.

: 4drop         \ x1 x2 x3 x4 --
Discard the top four data stack items.

: SWAP          \ x1 x2 -- x2 x1                                6.1.2260
Exchange the top two data stack items.

: 2SWAP         \ x1 x2 x3 x4 -- x3 x4 x1 x2                    6.1.0430
Exchange the top two cell-pairs on the data stack.

: DUP           \ x -- x x                                      6.1.1290
Duplicate the top stack item.

: ?DUP          \ x --  0 | x x                                 6.1.0630
DUPlicate the top stack item only if it is non-zero.

: 2rot          \ 1 2 3 4 5 6 -- 3 4 5 6 1 2                   8.6.2.0420
Perform ROT operation on 3 double numbers.

: 2DUP          \ x1 x2 -- x1 x2 x1 x2                          6.1.0380
DUPlicate the top cell-pair on the data stack.

: 3dup          \ x1 x2 x3 -- x1 x2 x3 x1 x2 x3
DUPlicate the top three items on the data stack.

: 4dup          \ x1 x2 x3 x4 -- x1 x2 x3 x4 x1 x2 x3 x4
DUPlicate the top 4 data stack items.

: OVER          \ x1 x2 -- x1 x2 x1                             6.1.1990
Copy NOS to a new top-of-stack item.

: 2OVER         \ x1 x2 x3 x4 -- x1 x2 x3 x4 x1 x2              6.1.0400
Similar to OVER but works with cell-pairs rather than cell items.

: UP@           \ -- up
Get the current address value of the user-area pointer.

: UP!           \ up --
Set the current address value of the user-area pointer.

: SP@           \ -- n
Get the current address value of the data-stack pointer.

: SP!           \ n --
Set the current address value of the data-stack pointer.

: RP@           \ -- m
Get the current address value of the return-stack pointer.

: RP!           \ m --
Set the current address value of the return-stack pointer.

: DEPTH         \ -- +n                                         6.1.1200
Return the number of items on the data stack, excluding the count.

: RDEPTH        \ -- +n
Return the number of items on the return stack.

: min           \ n1 n2 -- n1|n2                                6.1.1880
Given two data stack items preserve only the smallest.

: MAX           \ n1 n2 -- n1|n2                                6.1.1870
Given two data stack items preserve only the largest.

: umin           \ u1 u2 -- u1|u2
Unsigned version of MIN. Given two unsigned data stack items preserve only the smaller.

: umax           \ u1 u2 -- u1|u2
Unsigned version of MAX. Given two unsigned data stack items preserve only the larger.

: LOWORD        \ n -- n16
Mask off the low 16 bits of a cell.

: HIWORD        \ n -- n16
Mask off the high 16 bits of a cell and shift right by 16 bits.

: MAKELONG      \ lo hi -- 32bit
Given two 16 bit numbers produce a single 32 bit one.

: nslWiden      \ ... n --
Sign extend from 32 bits to cell width on Nth stack item.

: nswWiden      \ ... n --
Sign extend from 16 bits to cell width on Nth stack item.

: nsbWiden      \ ... n --
Sign extend from 8 bits to cell width on Nth stack item.

: nulWiden      \ ... n --
Zero extend from 32 bits to cell width on Nth stack item.

: nuwWiden      \ ... n --
Zero extend from 16 bits to cell width on Nth stack item.

: nubWiden      \ ... n --
Zero extend from 8 bits to cell width on Nth stack item.
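
The effect of LOWORD, HIWORD and MAKELONG can be sketched with standard words. This is one reading of the descriptions above, not the kernel source; the MY- names are hypothetical.

```forth
\ Hypothetical equivalents, for illustration only.
: my-loword    ( n -- n16 )        $FFFF and ;              \ keep bits 0..15
: my-hiword    ( n -- n16 )        16 rshift $FFFF and ;    \ bits 16..31, shifted down
: my-makelong  ( lo hi -- 32bit )  16 lshift swap $FFFF and or ;

hex
12345678 my-loword u.           \ displays 5678
12345678 my-hiword u.           \ displays 1234
5678 1234 my-makelong u.        \ displays 12345678
decimal
```

The sketch assumes that hi fits in 16 bits; on a 64 bit system a larger hi would produce a result wider than 32 bits.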

Comparisons

Various words to compare stack items and return flags.

: 0=            \ n -- t/f                                       6.1.0270
Compare the top stack item with 0 and return TRUE if equals.

: 0<>           \ n -- t/f                                       6.2.0260
Compare the top stack item with 0 and return TRUE if not-equal.

: 0<            \ n -- t/f                                       6.1.0250
Return TRUE if the top of stack is less-than-zero.

: 0>            \ n -- t/f                                       6.2.0280
Return TRUE if the top of stack is greater-than-zero.

: =             \ n1 n2 -- t/f                                   6.1.0530
Return TRUE if the two topmost stack items are equal.

: <>            \ n1 n2 -- t/f                                   6.2.0500
Return TRUE if the two topmost stack items are different.

: <             \ n1 n2 -- t/f                                   6.1.0480
Return TRUE if the second stack item is less than the topmost.

: >             \ n1 n2 -- t/f                                   6.1.0540
Return TRUE if the second stack item is greater than the topmost.

: <=            \ n1 n2 -- t/f
Return TRUE if the second stack item is less than or equal to the topmost.

: >=            \ n1 n2 -- t/f
Return TRUE if the second stack item is greater than or equal to the topmost.

: U>            \ n1 n2 -- t/f                                   6.2.2350
An UNSIGNED version of >.

: U<            \ n1 n2 -- t/f                                   6.1.2340
An UNSIGNED version of <.

: U>=           \ n1 n2 -- t/f
An UNSIGNED version of >=.

: U<=           \ n1 n2 -- t/f
An UNSIGNED version of <=.

: WITHIN?       \ n1 n2 n3 -- flag
Return TRUE if N1 is within the range N2..N3. This word uses signed arithmetic.

: WITHIN        \ n1|u1 n2|u2 n3|u3 -- flag                      6.2.2440
Return TRUE if n2|u2 <= n1|u1 < n3|u3. The ANS version of WITHIN?. Note the conditions: the lower limit is inclusive and the upper limit is exclusive. This word uses unsigned arithmetic, so that signed compares are treated as existing on a number circle.
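
The number circle behaviour of WITHIN follows from the way it is usually defined. The sketch below is the well known definition given in the rationale of the ANS specification, renamed MY-WITHIN here so that it does not replace the kernel word.

```forth
: my-within     \ n1|u1 n2|u2 n3|u3 -- flag
  over - >r     \ R: span = upper limit - lower limit
  -             \ offset = value - lower limit
  r> u<         \ true if offset < span, compared unsigned
;

5  0 10 my-within .     \ displays -1 : 0 <= 5 < 10
10 0 10 my-within .     \ displays 0  : the upper limit is exclusive
0  0 10 my-within .     \ displays -1 : the lower limit is inclusive
-1 -5 5 my-within .     \ displays -1 : signed ranges work too
```

Because only differences are compared, the same code works for signed and unsigned limits, provided the limits are of the same kind.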

: D0<           \ d -- flag                                   8.6.1.1075
Return true if d is less than zero (is negative).

: D0=           \ d -- flag                                   8.6.1.1080
Return true if d is zero.

: D0<>          \ d -- flag
Return true if d is non-zero.

: d=            \ d1 d2 -- flag                                8.6.1.1120
Return true if the two double numbers are equal.

: d<            \ d1 d2 -- flag                                8.6.1.1110
Return TRUE if the double number d1 is less than the double number d2.

: d>            \ d1 d2 -- flag
Return TRUE if the double number d1 is greater than the double number d2.

: dmax          \ d1 d2 -- d1|d2                               8.6.1.1210
Return the maximum double number from the two supplied.

: dmin          \ d1 d2 -- d1|d2                               8.6.1.1220
Return the minimum double number from the two supplied.

: DU<           \ ud1 ud2 -- flag
True if ud1<ud2.

: DU>           \ ud1 ud2 -- flag
True if ud1>ud2.

Arithmetic Operators

Shifts

: LSHIFT        \ x1 u -- x2                                     6.1.1805
Logically shift X1 by U bits left. The result of shifting by more than 31 bits is undefined.

: RSHIFT        \ x1 u -- x2                                     6.1.2162
Logically shift X1 by U bits right. The result of shifting by more than 31 bits is undefined.

: arshift       \ x1 u -- x2
An arithmetic right shift: shift x1 right by u bits, filling the vacated bits with copies of the original top bit. The result of shifting by more than 31 bits is undefined.

: ROL           \ x1 u -- x2
Logically rotate X1 by U bits left. The result of shifting by more than 31 bits is undefined.

: ROR           \ x1 u -- x2
Logically rotate X1 by U bits right. The result of shifting by more than 31 bits is undefined.
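
A rotate can be understood as two logical shifts whose results are combined. The following sketch builds a left rotate from LSHIFT and RSHIFT; MY-ROL is a hypothetical name, and the code assumes 8 bit address units and a rotate count u greater than zero and less than the cell width.

```forth
\ MY-ROL is a hypothetical equivalent, for illustration only.
: my-rol        \ x1 u -- x2 ; assumes 0 < u < bits per cell
  2dup lshift >r                  \ R: bits shifted up by u
  [ 1 cells 8 * ] literal         \ cell width in bits
  swap - rshift                   \ bits that fell off the top, moved to the bottom
  r> or                           \ combine the two parts
;

1 1 my-rol .                      \ displays 2
-1 1 rshift invert 1 my-rol .     \ displays 1 : the top bit wraps round to bit 0
```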

Multiplication

: *             \ n1 n2 -- n3                                    6.1.0090
Standard signed multiply. N3 = n1 * n2.

: M*            \ n1 n2 -- d                                     6.1.1810
Signed multiply yielding double result.

: UM*           \ u1 u2 -- ud                                    6.1.2360
Perform unsigned-multiply between two numbers and return double result.

: D2*           \ d1 -- d1*2                                   8.6.1.1090
Multiply the given double number by two.

Division

The ANS specification contains a discussion of symmetric and floored division.

Division produces a quotient q and a remainder r by dividing operand a by operand b. Division operations return q, r, or both. The identity

  b*q + r = a

shall hold for all a and b.

When unsigned integers are divided and the remainder is not zero, q is the largest integer less than the true quotient.

When signed integers are divided, the remainder is not zero, and a and b have the same sign, q is the largest integer less than the true quotient. If only one operand is negative, whether q is rounded toward negative infinity (floored division) or rounded towards zero (symmetric division) is implementation defined.

Floored division is integer division in which the remainder carries the sign of the divisor or is zero, and the quotient is rounded to its arithmetic floor. Symmetric division is integer division in which the remainder carries the sign of the dividend or is zero and the quotient is the mathematical quotient rounded towards zero or truncated. Examples of each are shown in the tables below.


Floored Division Example

Dividend        Divisor Remainder       Quotient
--------        ------- ---------       --------
10                 7       3                1
-10                7       4               -2
10                -7      -4               -2
-10               -7      -3                1

Symmetric Division Example

Dividend        Divisor Remainder       Quotient
--------        ------- ---------       --------
10                 7       3                1
-10                7      -3               -1
10                -7       3               -1
-10               -7      -3                1

Unless otherwise noted or specified, VFX Forth uses symmetric division.
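
The rows of the two tables can be reproduced with SM/REM and FM/MOD, which select symmetric and floored division explicitly regardless of the system default. S>D converts the single-cell dividend to the double-cell form that both words require.

```forth
\ Each word returns remainder then quotient, so . . displays the quotient first.
-10 s>d  7 sm/rem . .     \ displays -1 -3 : symmetric
-10 s>d  7 fm/mod . .     \ displays -2 4  : floored
 10 s>d -7 sm/rem . .     \ displays -1 3  : symmetric
 10 s>d -7 fm/mod . .     \ displays -2 -4 : floored
```

In every case the identity b*q + r = a holds; for example 7 * -2 + 4 = -10.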

: UM/MOD        \ ud u -- urem uquot                             6.1.2370
Perform unsigned division of double number UD by single number U and return remainder and quotient.

: SM/REM        \ d1 n1 -- n2 n3                                6.1.2214
Divide d1 by n1, giving the symmetric quotient n3 and the remainder n2. Input and output stack arguments are signed. An ambiguous condition exists if n1 is zero or if the quotient lies outside the range of a single-cell signed integer.

: FM/MOD        \ d1 n1 -- n2 n3                                6.1.1561
Divide d1 by n1, giving the floored quotient n3 and the remainder n2. Input and output stack arguments are signed. An ambiguous condition exists if n1 is zero or if the quotient lies outside the range of a single-cell signed integer.

: MU/MOD        \ ud1 u2 -- urem ud#quot
Perform an unsigned divide of a double by a single, returning a single remainder and a double quotient.

: /MOD          \ n1 n2 -- rem quot                              6.1.0240
Signed division of N1 by N2 (single-precision) yielding remainder and quotient.

: /             \ n1 n2 -- n3                                    6.1.0230
Standard signed division operator. n3 = n1/n2.

: u/            \ u1 u2 -- u3
Unsigned division operator. U3 = u1/u2.

: MOD           \ n1 n2 -- n3                                    6.1.1890
Return remainder of division of N1 by N2. n3 = n1 mod n2.

: M/            \ d n1 -- n2
Signed divide of a double by a single integer.

: D2/           \ d1 -- d1/2                                   8.6.1.1100
Divide the given double number by two. Signed and implemented as an arithmetic right shift, and so produces floored division.

Combined multiply and divide

These words provide combined multiply and divide operations with extended precision intermediate results. The point is to prevent overflow during integer scaling operations.

: */MOD         \ n1 n2 n3 -- rem quot                          6.1.0110
Multiply n1 by n2 to give a double precision result, and then divide it by n3 returning the remainder and quotient. The point of this operation is to avoid loss of precision.

: */            \ n1 n2 n3 -- n4                                6.1.0100
Multiply n1 by n2 to give a double precision result, and then divide it by n3 returning the quotient. The point of this operation is to avoid loss of precision.

: m*/           \ d1 n2 +n3 -- dquot
The result dquot=(d1*n2)/n3. The intermediate value d1*n2 is triple-precision. In an ANS Forth standard program n3 can only be a positive signed number and a negative value for n3 generates an ambiguous condition, which may cause an error on other implementations.
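
A typical use of the extended precision intermediate result is integer scaling. The sketch below computes a percentage with */; PERCENT is a hypothetical name used for illustration.

```forth
\ PERCENT is a hypothetical helper, not a kernel word.
: percent       \ n1 n2 -- n3 ; n3 = n1 * n2 / 100
  100 */        \ the product n1*n2 is held as a double, so it cannot overflow
;

2000000000 50 percent .   \ displays 1000000000
```

On a 32 bit system the product 2000000000 * 50 does not fit in a single cell, so writing `50 * 100 /` instead would give a wrong answer, whereas */ gives the correct result.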

Traditional short forms

: 1+            \ n1|u1 -- n2|u2                                 6.1.0290
Add one to top of stack.

: 2+            \ n1|u1 -- n2|u2
Add two to top of stack.

: 4+            \ n1|u1 -- n2|u2
Add four to top of stack.

: 1-            \ n1|u1 -- n2|u2                                 6.1.0300
Subtract one from top of stack.

: 2-            \ n1|u1 -- n2|u2
Subtract two from top of stack.

: 4-            \ n1|u1 -- n2|u2
Subtract four from top of stack.

: 2*            \ x1 -- x2                                       6.1.0320
Signed multiply top of stack by 2.

: 4*            \ x1 -- x2
Signed multiply top of stack by 4.

: 8*            \ x1 -- x2
Signed multiply top of stack by 8. In 64 bit systems only.

: 2/            \ x1 -- x2                                       6.1.0330
Right shift x1 one bit, sign preserved. From build 1276 onwards, this is an ANS compliant signed right shift. For an unsigned result, use U2/ or 1 RSHIFT.

: U2/           \ x1 -- x2
Unsigned divide top of stack by 2.

: 4/            \ x1 -- x2
Right shift x1 two bits, sign preserved. From build 1276 onwards, this is an ANS compliant signed right shift. For an unsigned result, use U4/ below or 2 RSHIFT.

: u4/           \ x1 -- x2
Unsigned divide top of stack by 4.

: 8/            \ x1 -- x2
Right shift x1 three bits, sign preserved. For an unsigned result, use U8/ below or 3 RSHIFT.

: u8/           \ x1 -- x2
Unsigned divide top of stack by 8.
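
The difference between the signed and unsigned shifts can be seen with a few values. This sketch uses only the standard words 2/ and RSHIFT.

```forth
-8 2/ .             \ prints -4
-7 2/ .             \ prints -4 ; the shift rounds towards negative infinity
 7 2/ .             \ prints 3
-8 1 rshift 0< .    \ prints 0 ; RSHIFT shifts in a zero, so the result is not negative
```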

Addition and subtraction

: +             \ n1|u1 n2|u2 -- n3|u3                           6.1.0120
Add two single precision integer numbers.

: -             \ n1|u1 n2|u2 -- n3|u3                           6.1.0160
Subtract two single precision integer numbers.

: D+            \ d1 d2 -- d3                                  8.6.1.1040
Add two double precision integers together.

: D-            \ d1 d2 -- d3                                  8.6.1.1050
Subtract two double precision integers. D3=D1-D2.

: M+            \ d1 n -- d2                                   8.6.1.1830
Add double d1 to sign extended single n to form double d2.

Negation and absolute value

: NEGATE        \ n1 -- n2                                       6.1.1910
Negate a single precision integer number.

: ?NEGATE       \ n1 flag -- n1|n2
If flag is negative, then negate n1.

: ABS           \ n -- u                                         6.1.0690
If n is negative, return its positive equivalent (absolute value).

: DNEGATE       \ d -- -d                                     8.6.1.1230
Negate a double number.

: ?dnegate      \ d n -- d'
If n is negative, negate the double number d.

: DABS          \ d -- |d|                                     8.6.1.1160
Double precision version of ABS.

Converting between single and double numbers

: S>D           \ n -- d                                         6.1.2170
Convert a single number to a double one.

: D>S           \ d -- n                                       8.6.1.1140
Convert a Double number to a single.
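
The following sketch shows a round trip through double numbers. DEMO-D-ADD is a hypothetical name. It sign extends two singles, adds them as doubles, and converts the result back to a single.

```forth
\ Hypothetical helper: add two singles using double arithmetic.
: demo-d-add    \ n1 n2 -- n3
  swap s>d      \ -- n2 d1
  rot s>d       \ -- d1 d2
  d+ d>s        \ -- n3
;

-5 7 demo-d-add .    \ prints 2
```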

Portability aids

These words make porting code between 16, 32, and 64 bit systems much easier. They avoid the use of heritage shortforms such as 2+ and 4+ which are dependent on the size of items on the data stack and in memory.

: CELL+         \ a-addr1 -- a-addr2                             6.1.0880
Add size of a cell to the top-of stack.

: CELLS         \ n1 -- n2                                       6.1.0890
Return size in address units of N1 cells in memory.

: CELL/         \ n1 -- n2
Divide top stack item by the size of a cell.

: CELLS+        \ n1 n2 -- n3
Modify address 'n1' by the size of 'n2' cells.

: CELL-         \ a-addr1 -- a-addr2
Decrement an address by the size of a cell.

: CELL          \ -- n
Return the size in address units of one cell.

: CHAR+         \ c-addr1 -- c-addr2                             6.1.0897
Increment an address by the size of a character.

: CHARS         \ n1 -- n2                                       6.1.0898
Return size in address units of N1 characters.
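
As a sketch of how CELLS and CELL+ keep code independent of the cell size, the following builds a small array and indexes it. DEMO-ARRAY and DEMO[] are hypothetical names used only for this example.

```forth
create demo-array  4 cells allot    \ room for four cells, whatever the cell size

\ Hypothetical helper: return the address of element number index.
: demo[]        \ index -- a-addr
  cells demo-array +
;

10 0 demo[] !   20 1 demo[] !
1 demo[] @ .            \ prints 20
0 demo[] cell+ @ .      \ also prints 20 ; CELL+ steps to the next element
```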

: cellbits      \ -- u
Count the number of bits in a cell - relies on 2s complement arithmetic. Useful when porting code between 16, 32 and 64 bit systems. Note that this is not a constant; the calculation is made at each use.

Dictionary Memory Manipulation

The following definitions provide the primitives for manipulation of dictionary memory.

variable DP     \ -- addr
Holds the address of the next free location in the dictionary.

: HERE          \ -- addr                                        6.1.1650
Return the current dictionary pointer which is the first address-unit of free space within the system.

: ALLOT         \ n --                                           6.1.0710
Allocate N address-units of data space from the current value of HERE and move the pointer.

: ALLOT&ERASE   \ n --
Allot n bytes of dictionary space and fill with Zero.

: ,             \ x --                                           6.1.0150
Place the CELL value X into the dictionary at HERE and increment the pointer.

: L,            \ x --
Place the 32 bit value X into the dictionary at HERE and increment the pointer.

: W,            \ x --
Place the WORD value X into the dictionary at HERE and increment the pointer.

: C,            \ x --                                           6.1.0860
Place the CHAR value X into the dictionary at HERE and increment the pointer.

: ALIGNED       \ addr -- a-addr                                 6.1.0706
Given an address pointer this word will return the next ALIGNED address subject to system wide alignment restrictions.

: HALF-ALIGNED  \ addr -- a-addr
Align an address pointer to within half the size of a CELL.

: ALIGN         \ --                                             6.1.0705
Align dictionary pointer using the same rules as ALIGNED. Unused dictionary space is ERASEd.
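
A short sketch of laying down data in the dictionary with , and C, follows. The names DEMO-PRIMES, DEMO-TAGS and DEMO-PRIME are hypothetical. ALIGN is used after the character data so that whatever is compiled next starts on an aligned address.

```forth
create demo-primes  2 , 3 , 5 , 7 ,                 \ four cells
create demo-tags    char A c, char B c, char C c,   \ three characters
align                                               \ realign the dictionary pointer

\ Hypothetical helper: fetch entry number index from the table of cells.
: demo-prime    \ index -- n
  cells demo-primes + @
;

2 demo-prime .      \ prints 5
demo-tags 3 type    \ prints ABC
```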

: HALF-ALIGN    \ --
Align the dictionary pointer to a half-cell boundary. Unused dictionary space is ERASEd.

Branch and flow control

The following definitions allow for a variety of loops and conditional execution constructs.

: I             \ -- n                                          6.1.1680
Return the current index of the inner-most DO ... LOOP.

: J             \ -- n                                          6.1.1730
Return the current index of the next outer DO ... LOOP, i.e. the loop enclosing the inner-most one.

: unloop        \ -- ; R: loop-sys --                           6.1.2380
Remove the DO ... LOOP control parameters from the return stack.

: BOUNDS        \ addr len -- addr+len addr
Modify the address and length parameters to provide an end-address and start-address pair suitable for a DO ... LOOP construct.

: EXIT          \ -- ; R: next-sys --                           6.1.1380
Compile code into the current definition to cause a definition to terminate. This is the Forth equivalent to inserting an RTS/RET instruction in the middle of an assembler subroutine.

: EXECUTE       \ xt --                                         6.1.1370
Execute the code described by the XT. This is a Forth equivalent of an assembler JSR/CALL instruction.

: perform       \ addr --
EXECUTE contents of addr if non-zero.

: DO            \ Run: n1|u1 n2|u2 -- ; R: -- loop-sys           6.1.1240
Begin a DO ... LOOP construct. Takes the end-value and start-value from the stack.

: ?DO           \ Run: n1|u1 n2|u2 -- ; R: -- | loop-sys         6.2.0620
Compile a DO which will only begin loop execution if the loop parameters do not specify an iteration count of 0.

: LOOP          \ Run: -- ; R: loop-sys1 -- | loop-sys2         6.1.1800
The closing statement of a DO ... LOOP construct. Increments the index and terminates when the index crosses the limit.

: +LOOP         \ Run: n -- ; R: loop-sys1 -- | loop-sys2       6.1.0140
As LOOP except that you specify the increment on the stack. The action of n +LOOP is peculiar when n is negative:

  -1 0 ?DO  i .  -1 +LOOP

prints 0 -1, whereas:

  0 0 ?DO  i .  -1 +LOOP

prints nothing. This is a result of the mathematical trick used to detect the terminating condition. To prevent confusion avoid using n +LOOP with negative n.

: LEAVE         \ -- ; R: loop-sys --                           6.1.1760
Compile code to exit a DO ... LOOP. Similar to 'C' language break.

: ?LEAVE        \ flag -- ; R: loop-sys --
A version of LEAVE which only takes effect if the given flag is non-zero.
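
The loop words above are commonly combined with BOUNDS to walk over a block of memory. The following is a sketch only, and DEMO-BYTE-SUM is a hypothetical name. ?DO is used so that a zero length block is skipped without executing the loop body.

```forth
\ Hypothetical helper: add up the bytes in a block of memory.
: demo-byte-sum \ addr len -- sum
  0 rot rot     \ -- 0 addr len ; running total underneath
  bounds ?do    \ loop from addr up to, but not including, addr+len
    i c@ +
  loop
;

: demo-ab       \ -- sum
  s" AB" demo-byte-sum
;
demo-ab .       \ prints 131 (65 + 66)
```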

: BEGIN         \ C: -- dest ; Run: --                          6.1.0760
Mark the start of a BEGIN..[WHILE]..UNTIL/AGAIN/REPEAT construct.

: AGAIN         \ C: dest -- ; Run: --                          6.2.0700
The end of a BEGIN..AGAIN construct which specifies an infinite loop.

: UNTIL         \ C: dest -- ; Run: flag --                      6.1.2390
Compile code into definition which will jump back to the matching BEGIN if the supplied condition flag is zero (false).

: WHILE         \ C: dest -- orig dest ; Run: flag --            6.1.2430
Separate the condition test from the loop code in a BEGIN ... WHILE ... REPEAT block.

: REPEAT        \ C: orig dest -- ; Run: --                     6.1.2140
Loop back to the conditional test code in a BEGIN ... WHILE ... REPEAT construct.
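
As a sketch of BEGIN ... WHILE ... REPEAT, the following computes a greatest common divisor by Euclid's method. DEMO-GCD is a hypothetical name and the example assumes positive arguments. The test sits between BEGIN and WHILE, and the loop body sits between WHILE and REPEAT.

```forth
\ Hypothetical helper: greatest common divisor of two positive numbers.
: demo-gcd      \ u1 u2 -- u
  begin
    dup         \ is the second number still non-zero?
  while
    tuck mod    \ -- u2 (u1 mod u2)
  repeat
  drop          \ discard the zero, leaving the result
;

12 18 demo-gcd .    \ prints 6
```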

: IF            \ C: -- orig ; Run: x --                        6.1.1700
Mark the start of an IF ... [ELSE] ... THEN conditional block. ELSE is optional.

: THEN          \ C: orig -- ; Run: --                          6.1.2270
Mark the end of an IF..THEN or IF..ELSE..THEN conditional block.

: ENDIF         \ C: orig -- ; Run: --
An alias for THEN. Note that ANS Forth describes THEN not ENDIF.

: AHEAD         \ C: -- orig ; Run: --                       15.6.2.0702
Start an unconditional forward branch which will be resolved later.

: ELSE          \ C: orig1 -- orig2 ; Run: --                   6.1.1310
Begin the failure condition code for an IF.

: CASE          \ C: -- case-sys ; Run: --                      6.2.0873
Begin a CASE..ENDCASE construct. Similar to the C switch.

: OF            \ C: -- of-sys ; Run: x1 x2 -- | x1             6.2.1950
Begin conditional block for CASE, executed when the switch value x1 is equal to x2.

: ?OF           \ C: -- of-sys ; Run: flag --
Begin conditional block for CASE, executed when the flag is true.

: ENDOF         \ C: case-sys1 of-sys -- case-sys2 ; Run: --    6.2.1343
Mark the end of an OF conditional block within a CASE construct. Compile a jump past the ENDCASE marker at the end of the construct.

: ENDCASE       \ C: case-sys -- ; Run: x --                    6.2.1342
Terminate a CASE ... ENDCASE construct. DROPs the switch value from the stack.
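
A minimal sketch of a CASE ... ENDCASE construct with a default clause follows. DEMO-VOWEL? is a hypothetical name. Note how the default code leaves the selector on top of the stack, because ENDCASE drops it.

```forth
\ Hypothetical helper: return true if char is a lower case vowel.
: demo-vowel?   \ char -- flag
  case
    [char] a of  true  endof
    [char] e of  true  endof
    [char] i of  true  endof
    [char] o of  true  endof
    [char] u of  true  endof
    false swap  \ default: put the result under the selector
  endcase       \ ENDCASE drops the selector, leaving the flag
;

char e demo-vowel? .    \ prints -1
char x demo-vowel? .    \ prints 0
```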

: END-CASE      \ C: case-sys -- ; Run: --
A version of ENDCASE which does not drop the switch value. Used when the switch value itself is consumed by a default condition or another result is to be returned.

: BEGINCASE     \ C: -- case-sys ; Run: --
Start a BEGINCASE ... NEXTCASE construct. This construct is a loop which uses OF ... ENDOF clauses like CASE ... ENDCASE, but the loop terminates after the action in the OF ... ENDOF clause. BEGINCASE ... NEXTCASE is used to construct multiple-exit loops without the appearance of spaghetti code.

: NEXTCASE      \ C: case-sys -- ; Run: x --
Terminate a BEGINCASE ... NEXTCASE construct. DROPs the switch value and branches back to BEGINCASE. Note that from VFX Forth 4.0, BEGINCASE must be used with NEXTCASE.

: NEXT-CASE     \ C: case-sys -- ; Run: --
A version of NEXTCASE which does not drop the switch value. Used when the switch value itself is consumed by a default condition. Note that from VFX Forth 4.0, BEGINCASE must be used with NEXTCASE.

: cs-pick       \ xu .. x0 u -- xu .. x0 xu                  15.6.2.1015
Get a copy of the uth compilation stack item.

: cs-roll       \ xu..x0 u -- xu-1..x0 xu                    15.6.2.1020
Rotate the order of the top u compilation stack items by one place such that the current top of stack becomes the second item and the uth item becomes TOS.

: cs-drop       \ x --
Discard the top item on the compilation stack.

: RECURSE       \ Comp: --                                      6.1.2120
Compile a recursive call to the colon definition containing RECURSE itself. Do not use RECURSE between DOES> and ;. Used in the form:

  : foo  ... recurse ... ;

to compile a reference to FOO from inside FOO.
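
A complete, if simple, use of RECURSE is sketched below. DEMO-FACTORIAL is a hypothetical name, and no check is made for overflow of the result.

```forth
\ Hypothetical helper: factorial by recursion.
: demo-factorial        \ u -- u!
  dup 2 < if            \ 0! and 1! are both 1
    drop 1 exit
  then
  dup 1- recurse *      \ u * (u-1)!
;

5 demo-factorial .      \ prints 120
```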

Memory operators

The following words are used to operate on memory locations and arbitrary memory blocks.

: ON            \ addr --
Given the address of a CELL this will set its contents to TRUE.

: OFF           \ addr --
Given the address of a CELL this will set its contents to FALSE.

: @OFF          \ addr -- x
Read cell at addr, and set it to 0.

: @on           \ addr -- val
Fetch contents of addr and set to -1.

: +!            \ n addr --                                      6.1.0130
Add N to the CELL at memory address ADDR.

: w+!           \ w addr --
Add W to the 16 bit word at memory address ADDR.

: C+!           \ b addr --
Add B to the character (byte) at memory address ADDR.

: -!            \ n addr --
Subtract N from the CELL at memory address ADDR.

: w-!           \ w addr --
Subtract W from the 16 bit word at memory address ADDR.

: C-!           \ b addr --
Subtract B from the character (byte) at memory address ADDR.

: incr          \ a-addr --
Increment the data cell at a-addr by one.

: decr          \ a-addr --
Decrement the data cell at a-addr by one.

: 2@            \ a-addr -- x1 x2                                6.1.0350
Fetch and return the two CELLS from memory ADDR and ADDR+sizeof(CELL). The cell at the lower address is on the top of the stack.

: @             \ addr -- n                                      6.1.0650
Fetch and return the CELL at memory ADDR.

: l@            \ addr -- val
Fetch and 0 extend the word (32 bit) at memory ADDR.

: w@            \ addr -- val
Fetch and 0 extend the word (16 bit) at memory ADDR.

: c@            \ addr -- val                                    6.1.0870
Fetch and 0 extend the character at memory ADDR.

: l@s            \ addr -- val
Fetch and sign extend the word (32 bit) at memory ADDR. This word is in 64 bit systems only.

: w@s           \ addr -- val(signed)
A sign extending version of W@.

: c@s           \ addr -- val(signed)
A sign extending version of C@.

: 2!            \ x1 x2 addr --                                  6.1.0310
Store the two CELLS x1 and x2 at memory ADDR. X2 is stored at ADDR and X1 is stored at ADDR+CELL.

: !             \ n addr --                                      6.1.0010
Store the CELL quantity N at memory ADDR.

: l!            \ val addr --
Store the word (32 bit) quantity VAL at memory ADDR.

: w!            \ val addr --
Store the word (16 bit) quantity VAL at memory ADDR.

: c!            \ val addr --                                    6.1.0850
Store the character VAL at memory ADDR.

: fill          \ addr len char --                               6.1.1540
Fill LEN bytes of memory starting at ADDR with the byte information specified as CHAR.

: set-bit       \ mask c-addr --
OR the mask into the byte at c-addr, setting the bits selected by the mask. Byte operation.

: clear-bit     \ mask c-addr --
AND the inverted mask into the byte at c-addr, clearing the bits selected by the mask. Byte operation.

: toggle-bit    \ mask c-addr --
Invert the bits at c-addr specified by the mask. Byte operation.

: test-bit      \ mask addr -- flag
AND the mask with the contents of addr and return true (-1) if the result is non-zero, or false (0) if the result is zero.

: cmove         \ addr1 addr2 count --                        17.6.1.0910
Copy COUNT bytes of memory forwards from ADDR1 to ADDR2. Note that as VFX Forth characters are 8 bit units, there is an implicit connection between a byte and a character.

: cmove>        \ addr1 addr2 count --                        17.6.1.0920
As CMOVE but working in the opposite direction, copying the last character in the string first. Note that as VFX Forth characters are 8 bit units, there is an implicit connection between a byte and a character.

: MOVE          \ addr1 addr2 u --                              6.1.1900
An intelligent memory move which avoids memory overlap problems. Note that as VFX Forth characters are 8 bit units, there is an implicit connection between a byte and a character.

: movex         \ src dest +n --
An optimised version of MOVE. If n<=0, no action is taken.

: ERASE         \ a-addr u --                                    6.2.1350
Fill U bytes of memory from A-ADDR with 0.

: BLANK         \ a-addr u --                                 17.6.1.0780
Blank U bytes of memory from A-ADDR using ASCII 32 (space).

: UNUSED        \ -- u                                           6.2.2395
Return the number of bytes free in the dictionary.
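
The block operations can be combined as in the following sketch, which uses only the standard words FILL and MOVE. DEMO-BUF and DEMO-FILL-MOVE are hypothetical names. The second MOVE copies between overlapping regions, which is the case that MOVE is designed to handle.

```forth
create demo-buf  8 allot

\ Hypothetical helper: fill the buffer, then copy within it.
: demo-fill-move        \ --
  demo-buf 8 [char] - fill      \ buffer holds --------
  s" abc" demo-buf swap move    \ buffer holds abc-----
  demo-buf dup 2 + 3 move       \ overlapping copy: ababc---
;

demo-fill-move
demo-buf 8 type         \ prints ababc---
```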

String operators

The following words are used to operate on strings. With care, some of them may also be used on arbitrary memory blocks.

In modern Forth strings are usually described by caddr/len pairs on the stack ( -- caddr len ), where caddr points to the first character and len is the number of characters in the string. Another form often used is counted strings ( -- caddr ) in which caddr points to a count byte that is then followed by that many characters. Zero terminated strings are supported and are used for interfacing with the operating system and other libraries. Zero terminated string handling is described in a separate section of this manual.

In VFX Forth implementations for byte-addressed CPUs such as are used on PCs, a character is a byte-sized item. This means that the common assumption that a character=byte is true. However, if your code has to be ported to CPUs for which this assumption is not true (e.g. DSPs) or for which the size of a character is not one byte, then be very careful.

Caddr/len strings

: /string       \ addr len n -- addr+n len-n                  17.6.1.0245
Modify a string address and length to remove the first N characters from the string.
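
For example, the following sketch removes a six character prefix. DEMO-TAIL is a hypothetical name.

```forth
\ Hypothetical helper: return the text after the prefix "hello ".
: demo-tail     \ -- caddr len
  s" hello world" 6 /string
;

demo-tail type      \ prints world
```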

: SKIP          \ c-addr u char -- 'c-addr 'u
Modify string description by skipping over leading occurrences of char. Note that when a space char is given, tabs are also ignored.

: scan          \ caddr u char -- caddr2 u2
Look for first occurrence of char in string and return the new string. C-addr2/u2 describe the string with char as the first character. Note that when a space char is given, a tab is also treated as a space.

: -TRAILING     \ c-addr u1 -- c-addr u2                      17.6.1.0170
Modify a string address/length pair to ignore any trailing space or tab characters.

: -leading      \ caddr len -- caddr' len'
Modify a string address/length pair to ignore any leading space or tab characters.

: -white        \ caddr len -- caddr' len'
Remove leading and trailing white space from a string.

: UPC           \ char -- char'
Convert supplied character to upper case if it was alphabetic otherwise return the unmodified character. UPC is English language specific.

: UPPER         \ addr len --
Convert the ASCII string described to upper-case. This operation happens in place. UPPER is English language specific.

: ucmove        \ addr1 addr2 len --
Copy len bytes/characters of memory forwards from addr1 to addr2, converting to upper case. Note that as VFX Forth characters are 8 bit units, there is an implicit connection between a byte and a character.

: ucmove>       \ addr1 addr2 len --
Copy len bytes/characters of memory backwards starting at addr1+len-1 to addr2+len-1, converting to upper case. Note that as VFX Forth characters are 8 bit units, there is an implicit connection between a byte and a character.

: umove         \ addr1 addr2 u --
An intelligent memory move which avoids memory overlap problems. Characters are converted to upper case during the move. Note that as VFX Forth characters are 8 bit units, there is an implicit connection between a byte and a character.

: uplace        \ c-addr1 u c-addr2 --
Copy the string described by c-addr1/u to an upper-case counted string at c-addr2.

: s=            \ addr1 addr2 len -- flag
Compare two same-length strings or memory blocks, returning true if they are identical.

: str=          \ addr1 len1 addr2 len2 -- flag
Compare two addr/len memory blocks, returning true if they are identical both in length and contents. The comparison is case sensitive.

: is=           \ c-addr1 c-addr2 u -- flag
Compare two same-length strings/memory blocks, returning true if they are identical. The comparison is case insensitive.

: istr=         \ addr1 len1 addr2 len2 -- flag
Compare two addr/len memory blocks, returning TRUE if they are identical both in length and contents. The comparison is case insensitive.

: compare       \ c-addr1 u1 c-addr2 u2 -- n                    17.6.1.0935
Compare two strings. The return result is 0 for a match or can be -ve/+ve indicating string differences. If the two strings are identical, n is zero. If the two strings are identical up to the length of the shorter string, n is minus-one (-1) if u1 is less than u2 and one (1) otherwise. If the two strings are not identical up to the length of the shorter string, n is minus-one (-1) if the first non-matching character in the string specified by c-addr1 u1 has a lesser numeric value than the corresponding character in the string specified by c-addr2 u2 and one (1) otherwise.
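
The three possible results of COMPARE are illustrated in the sketch below. DEMO-ORDER is a hypothetical name.

```forth
\ Hypothetical helper: return the results of three comparisons.
: demo-order    \ -- n1 n2 n3
  s" abc"  s" abc" compare      \ 0  ; identical
  s" abc"  s" abd" compare      \ -1 ; c is less than d
  s" abcd" s" abc" compare      \ 1  ; equal up to the shorter length, first is longer
;

demo-order . . .    \ prints 1 -1 0 (top of stack first)
```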

: icompare      \ c-addr1 u1 c-addr2 u2 -- n
A case insensitive version of COMPARE.

: SEARCH        \ c-addr1 u1 c-addr2 u2 -- c-addr3 u3 f     17.6.1.2191
Search the string c-addr1/u1 for the string c-addr2/u2. If a match is found return c-addr3/u3, the address of the start of the match and the number of characters remaining in c-addr1/u1, plus flag f set to true. If no match was found return c-addr1/u1 and f=0. Case sensitive.

: instring      \ pattern lenp source lens -- flag
Return true if the source text contains the pattern text. Case-sensitive.
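
A word with the same stack effect as INSTRING can be sketched in terms of SEARCH. This is an illustration only and is not claimed to be how VFX Forth implements INSTRING. The names DEMO-CONTAINS? and DEMO-CHECK are hypothetical.

```forth
\ Hypothetical helper: true if the source text contains the pattern text.
: demo-contains?        \ pattern lenp source lens -- flag
  2swap search          \ -- c-addr3 u3 flag
  nip nip               \ keep only the flag
;

: demo-check    \ -- flag
  s" wor" s" hello world" demo-contains?
;
demo-check .    \ prints -1
```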

: $Null         \ -- caddr 0
Return a null string.

: extractNum    \ caddr len base -- caddr' len' u
Extract a number in the given base from the start of the string, returning the remaining string starting at the first non-numeric character and the converted number.

: ExtractText   \ caddr len char -- raddr rlen laddr llen
Extract text delimited by char from the string caddr/len. Text before the leading delimiter is ignored. Return the string remaining and string between the delimiters. For example:

  s"     'foo' 1 2 10 " char ' ExtractText

will return the strings " 1 2 10 " and "foo". If either of the delimiters is not present, the original string is returned as raddr/rlen and laddr/llen is a null string.

: csplit        \ addr len char -- raddr rlen laddr llen
Extract a substring at the start of addr/len, returning the string raddr/rlen which includes char (if found) and the string laddr/llen which contains the text to left of char. If the string does not contain the character, raddr is addr+len and rlen=0.

: not-overlapped?       \ caddr1 len1 caddr2 len2 -- flag
Return true if the two strings do not overlap.

: overlapped?   \ caddr1 len1 caddr2 len2 -- flag
Return true if the two strings overlap.

Counted strings

create cNull    \ -- addr
Return the address of an empty counted string.

: place         \ c-addr1 u c-addr2 --
Copy the string described by c-addr1 u to a counted string at the memory address described by c-addr2.

: count         \ addr1 -- addr2 len                             6.1.0980
Given the address of a counted string in memory this word will return the address of the first character and the length in characters of the string.
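
The relationship between a counted string and its caddr/len form is shown in the sketch below, which uses the standard word C" to build the counted string. DEMO-COUNTED is a hypothetical name.

```forth
\ Hypothetical helper: return the address of a counted string.
: demo-counted  \ -- c-addr
  c" hello"
;

demo-counted c@ .           \ prints 5, the count byte
demo-counted count type     \ prints hello
```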

: $move         \ caddr1 caddr2 -- ; move counted string
Copy a counted string from caddr1 to caddr2. Overlapped strings are handled properly.

: SMOVE         \ caddr1 caddr2 --
Copy the counted string at caddr1 to caddr2. Overlapped strings are handled properly.

: addchar       \ char string --
Add the character to the end of the counted string.

: append        \ c-addr u $dest --
Add the string described by c-addr/u to the counted string at $dest.

: $+            \ $addr1 $addr2 --
Add the counted string $ADDR1 to the counted buffer at $ADDR2.

: s+            \ source dest --
Given the addresses of two counted strings, add the source string to the end of the destination string.

Zero-terminated strings

This section provides a set of simple words for handling zero-terminated strings. Additional words can be found in the tools layer.

create zNull    \ -- addr
Return the address of a zero terminated null string.

: zstrlen       \ addr -- len
Return the length of a 0 terminated string.

: zcount        \ zaddr -- zaddr len
A version of COUNT for zero terminated strings, returning the address of the first character and the length.

: zplace        \ caddr len zaddr --
Copy the string caddr/len to zaddr as a 0 terminated string.

: zmove         \ src dst -- ; shows off the optimiser
Copy a zero terminated string.

: zAppend       \ caddr len zdest --
Add the string defined by caddr/len to the end of the zero terminated string at zdest.

: Appendz       \ caddr len zdest --
OBSOLETE and REMOVED: use zAppend above instead.

: (z$+)         \ caddr u zdest$ --
Add the source string caddr/u to the end of the zero terminated destination string. OBSOLETE and REMOVED: use zAppend above instead.

: z$+           \ zsrc$ zdest$ -- ; add zsrc$ to end of zdest$
Add the source string to the end of the destination string. Both strings are zero terminated.

: zterm         \ caddr len -- caddr len
Zero terminate the given string.

: >zterm        \ caddr len -- z$
Convert a caddr/len string to a zero-terminated string.

: c>czterm      \ c$ -- z$
Convert a counted string in place to a counted and zero terminated string. The address of the zero-terminated section is returned.

: czplace       \ caddr len dest --
Store the string caddr/len as a counted and zero-terminated string at dest. The strings must not overlap.

Pattern matching

VFX Forth provides a few words that check if a string matches a template string that can have simple wildcards. If you need something more sophisticated, you are probably best off interfacing to a regex library such as the one at

  www.pcre.org

Our thanks to Graham Smith at Tectime for the code.

Take two strings, a 'source' string and a 'pattern'. The test is to see if the source matches the pattern where the pattern can contain the wildcard characters '?' and '*'. These two characters can be 'escaped' using the character '\'.

The asterisk as a wildcard implies 'any of zero or more characters match'. Thus '*' will match with each of 'a', '12abxyz' and the zero length string ''. A lone asterisk therefore matches anything. A pattern of "ab*12" will match any text which starts with 'ab' and ends with '12'.

The question mark indicates any one character. Under DOS/Windows the question mark can also match zero characters, but this behaviour seems inconsistent - see below for an example. The code here insists that a question mark matches exactly one character. Thus '?' matches 'a', 'b' and '%'. It does not match the zero-length string ''.


Source           Pattern         Match
--------------   --------------  -----
"abc"            "abc"           yes
"abcd"           "abc"           no
"abc"            "abc*"          yes
"abc"            "abc?"          no
"abc"            "*abc"          yes
"abc"            "?abc"          no
"ab"             "a?b"           no
"a"              "?"             yes
"123abc"         "*abc"          yes
"123abc"         "?abc"          no
"123abc"         "???abc"        yes
"123abc"         "1*c"           yes
""               ""              yes

: wcMatch?      \ src slen ptn plen -- t/f
Wild Card Match. src is the address of the start of a source string and slen is its length. Similarly, ptn is the address of a pattern string and its length is plen. A value of TRUE is returned only if the source string matches the pattern according to the rules described above. The comparison is case sensitive.

: iwcMatch?     \ src slen ptn plen -- t/f
As wcMatch? above, but the comparison is case insensitive.

: strRmatch     \ *s lastS il *p lastP jl -- flag
Return true if the string described by addr last first matches the pattern described by a similar set of three parameters. In each set of three parameters (triple), addr is the start of the string, last is the zero-based index of the last character in the string, and first is the zero-based index of the first character. Originally coded as a primitive of $cstrmatch and $strmatch, this word now converts the two triples to the more standard Forth addr/length pairs and calls wcMatch?.

: $cstrmatch    \ src srclen patt pattlen -- flag
A synonym for wcMatch?.

: $strmatch     \ src patt -- flag
Perform wcMatch? on two counted strings.

: zstrmatch     \ src patt -- flag
Perform wcMatch? on two zero-terminated strings.

DOS/Windows inconsistency

When using the wildcard character '?' in a file path/name matching routine in Windows or DOS, e.g. the DIR shell command, the question mark sometimes matches zero characters. For instance a pattern of 'ab?.*' matches the file name 'ab.txt'. However, placing the question mark in another position causes the match to fail. For example, the pattern '?ab.*' does not match 'ab.txt'.

SYSPAD buffering

The SYSPAD mechanism replaces the use of PAD in the kernel. SYSPAD is built in the user area of each task and forms a circular buffer of strings. The lifetime of each string is not defined. It will last until another buffer request causes the memory to be reused.

: getSyspad     \ u -- addr
Reserve u bytes in the SYSPAD area and return the base address.

: >Syspad       \ caddr len -- caddr' len
Copy a string to SYSPAD and return the new string.

: >SyspadC      \ caddr len -- caddr'
Copy a string to SYSPAD and return the new counted string.

: >SyspadZ      \ caddr len -- zaddr
Copy a string to SYSPAD and return the new zero terminated string.

Formatted and Unformatted number conversion

Tools

: BELL          \ --
EMIT the ASCII bell character (7). Not all output devices support this function. The USER variable OUT is not incremented by this word.

: SPACE         \ --                                            6.1.2220
Output a space (ASCII #32) character to the terminal.

: SPACES        \ n --                                           6.1.2230
Output 'n' spaces to the terminal, where n>0. For n<=0 no action is taken.

: >pos          \ +n --
Place cursor on current line to column n if possible.

: BS            \ --
Output a destructive backspace sequence to the terminal. If the cursor is not at column 0, ASCII characters 8, 32 and 8 are EMITted and the USER variable OUT is decremented by one.

: HEX           \ --                                             6.2.1660
Change current number conversion base to base 16.

: DECIMAL       \ --                                             6.1.1170
Change current number conversion base to base 10.

: BINARY        \ --
Change current number conversion base to base 2.

Numeric output

These words are used for displaying numbers.

: HOLD          \ char --                                       6.1.1670
Insert the ASCII 'char' value into the pictured numeric output string currently being assembled.

: HOLDS         \ caddr len --
Insert the string caddr/len into the pictured numeric output string currently being assembled.

: SIGN          \ n --                                          6.1.2210
Insert the ASCII 'minus' symbol into the numeric output string if 'n' is negative.

: #             \ ud1 -- ud2                                    6.1.0030
Given a double number on the stack this will add the next digit to the pictured numeric output buffer and return the next double number to work with. N.B. the output string is built from right (lsd) to left (msd).

: #S            \ ud1 -- ud2                                    6.1.0050
Keep performing # until all digits are generated.

: <#            \ --                                            6.1.0490
Begin definition of a new numeric output string buffer.
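
The pictured numeric output words in this group (<#, #, #S, HOLD and #>) are typically combined as in the following sketch, which formats a count of cents as a decimal amount. DEMO-CENTS is a hypothetical name, and the example assumes that the current base is DECIMAL. Because the string is built from right to left, the cents digits are generated first.

```forth
\ Hypothetical helper: format a count of cents, e.g. 1234 as 12.34
: demo-cents    \ u -- caddr len
  0                 \ make an unsigned double from u
  <# # #            \ two digits for the cents
     [char] . hold  \ then the decimal point
     #s             \ then all remaining digits, at least one
  #>
;

1234 demo-cents type    \ prints 12.34
5 demo-cents type       \ prints 0.05
```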

: #>            \ xd -- c-addr u                                6.1.0040
Terminate definition of a numeric output string. Returns address and length of the ASCII result.

: .BYTE         \ b --
Display the byte b as a 2 digit hex number.

: .WORD         \ w --
Display the 16 bit word 'w' as a 4 digit hex number.

: .LWORD        \ dw --
Display the 32 bit long word 'dw' as an 8 digit hex number. The two groups of four digits are separated by a ':'.

: .DWORD        \ dw --
An 'Intel-ised' alias for .LWORD.

: .XWORD        \ dx --
Display the 64 bit xlong word 'dx' as a 16 digit hex number. The four groups of four digits are separated by ':' characters.

: .ASCII        \ char --
Output the supplied ASCII character 'char' via EMIT if it is a displayable character. Otherwise a period '.' is output.

: (u.)          \ u -- caddr len
Return the ASCII string corresponding to the unsigned number u.

: (.)           \ n -- caddr len
Create an ASCII string for the signed number n.

: (u.r)         \ u +n -- caddr len
Return the string corresponding to the unsigned number u. The string is right aligned in a field +n characters wide.

: UD.R          \ ud n --
Output the unsigned double number 'ud' using the current BASE, right justified to 'n' characters. Padding is inserted using spaces on the left side.

: D.R           \ d n --                                       8.6.1.1070
Output the signed double number 'd' using the current BASE, right justified to 'n' characters. Padding is inserted using spaces on the left side.

: D.            \ d --                                         8.6.1.1060
Output the double number 'd' without padding.

: .             \ n --                                           6.1.0180
Output the cell signed value 'n' without justification.

: U.            \ u --                                           6.1.2320
As with . but treat as unsigned.

: U.R           \ u n --                                         6.2.2330
As with D.R but uses a single-unsigned cell value.

: .R            \ n1 n2 --                                       6.2.0210
As with D.R but uses a single-signed cell value.
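
As an illustration of the justified output words, the sketch below prints a signed and an unsigned value in two columns. The name .PAIR and the field width of 8 are arbitrary choices for the example.

```forth
: .pair         \ n u -- ; hypothetical: two right-justified columns
  swap 8 .r             \ signed value in an 8 character field
  8 u.r cr              \ unsigned value in an 8 character field
;

decimal -42 42 .pair    \ each value is right-aligned in its column
```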

Numeric input conversion

VFX Forth provides a flexible number conversion system. It is designed for application use as well as for compiling Forth source code.

The ANS and Forth200x Forth standards specify that floating point numbers must be entered in the form 1.234e5 and must contain a point '.' and 'e' or 'E'. Double numbers (integers) are terminated by a point '.'.

This situation prevents the use of the standard conversion words in international applications because of the interchangeable use of the '.' and ',' characters in numbers. To ease this, VFX Forth uses two system variables, FP-CHAR and DP-CHAR, to hold the characters used as the floating point and double number integer indicator characters. By default, FP-CHAR is initialised to '.' and DP-CHAR is initialised to ',' and '.'. For ANS and Forth200x compliance, you should set them as follows:


\ ANS standard setting
  char . dp-char !
  char . fp-char !
: ans-floats    \ -- ; for strict ANS compliance
  [char] . dp-char !
  [char] . fp-char !
;
\ MPE defaults
  char , dp-char c!
  char . dp-char 1+ c!
  0 dp-char 2+ c!
  char . fp-char !
: mpe-floats    \ -- ; for VFX Forth v4.4 onwards
  [char] , dp-char c!
  [char] . dp-char 1+ c!
  0 dp-char 2+ c!
  [char] . fp-char !
;
: mpe-floats    \ -- ; for VFX Forth before v4.4
  [char] , dp-char !
  [char] . fp-char !
;

You can of course set these variables to any value that suits your application's language and locale. Note that integer conversion is always attempted before floating point conversion. This means that if FP-CHAR and DP-CHAR contain the same characters, floating point numbers must contain 'e' or 'E'. If they are different, a number containing a character in FP-CHAR will be successfully converted as a floating point number, even if it does not contain 'e' or 'E'.
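
As a sketch of a locale-specific setting, the word below selects ',' as the floating point character and '.' as the only double number separator. The name EURO-FLOATS is hypothetical, and the code assumes the DP-CHAR layout used by the v4.4 onwards version of MPE-FLOATS (a list of bytes ended by a zero byte).

```forth
: euro-floats   \ -- ; hypothetical, assumes the v4.4+ DP-CHAR layout
  [char] . dp-char c!           \ '.' marks a double number
  0 dp-char 1+ c!               \ zero byte ends the separator list
  [char] , fp-char !            \ ',' is the floating point character
;
```

With this setting FP-CHAR and DP-CHAR differ, so a number containing ',' can convert as a floating point number without 'e' or 'E'.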

: DIGIT         \ char base -- 0 | n true
If the ASCII value char can be treated as a digit for a number within the number conversion base base, i.e. in the range 0..base-1, then return the digit and a TRUE/-1 flag, otherwise return FALSE/0.
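
Because DIGIT returns either one or two items, callers usually test the top item first. The sketch below reduces the result to a single flag; the name HEX-DIGIT? is hypothetical.

```forth
: hex-digit?    \ char -- flag ; hypothetical
  16 digit              \ -- 0 | n true
  dup if  nip  then     \ on success discard n and keep the flag
;

decimal
char 7 hex-digit? .     \ displays -1
char G hex-digit? .     \ displays 0
```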

: SKIP-SIGN     \ addr1 len1 -- addr2 len2 t/f
Given the address and length of a string skip a leading plus or minus symbol and return modified address and length. The flag t/f is TRUE if a leading minus was found. From build 2514 onwards, conversion is case insensitive.

: +DIGIT        \ d1 n -- d2
Accumulate digit value n into double d1 to form d2 such that d2=d1*base+n.

: isSep?        \ char addr -- flag
Return true if char is one of the four bytes at addr. If fewer than four bytes are needed, a zero byte acts as a terminator.

: +CHAR         \ char -- flag
The character char is not a digit, so check to see if it is another permitted character in a number such as a double number separator. Return true if char is valid.

: +ASCII-DIGIT  \ d1 char -- d2 flag
Accumulate the double number d1 with the conversion of char, returning true if the character is a valid digit or part of an integer.

: OverrideBase  \ caddr u -- caddr' u'
Used by isInteger? to force a BASE override. See isInteger? below for details.

: isInteger?    \ caddr len -- d 2 | n 1 | 0
Attempt to convert the string caddr/len to an integer. The return result is 0 if conversion failed, 1 with a single-cell integer beneath it, or 2 with a double-cell integer beneath it. The ASCII number string supplied can also contain number conversion base overrides. A leading $ enforces hexadecimal, a leading # enforces decimal and a leading % enforces binary. Hexadecimal numbers can also be specified by a leading '0x' or trailing 'h'. Character literals can be obtained with 'x' where x is the character. A double number contains one of the characters in the variable DP-CHAR, by default ',' and '.'.

: integer?      \ caddr -- d 2 | n 1 | 0
As isInteger? but takes a counted string.
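
The variable-sized result of isInteger? is easiest to handle with a CASE on the top item. The following is a sketch only; the name .INT? is hypothetical and the displayed text is arbitrary.

```forth
: .int?         \ caddr len -- ; hypothetical
  isInteger? case
    0 of  ." not an integer"  endof
    1 of  ." single: " .      endof     \ n is beneath the 1
    2 of  ." double: " d.     endof     \ d is beneath the 2
  endcase
;

decimal
s" $FF" .int?           \ a leading $ forces hexadecimal conversion
s" xyz" .int?
```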

: >NUMBER       \ ud1 c-addr1 u1 -- ud2 c-addr2 u2              6.1.0570
Accumulate digits from string c-addr1/u1 into double number ud1 to produce ud2 until the first non-convertible character is found. c-addr2/u2 represents the remaining string with c-addr2 pointing to the non-convertible character. The number base for conversion is defined by the contents of USER variable BASE. From build 1656 onwards >NUMBER is case insensitive.
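
A common use of >NUMBER is to convert a whole string and report whether every character was consumed. The sketch below uses only standard words; the name STR>U is hypothetical. An empty string converts to 0 with a true flag, so callers may want to check the length first.

```forth
: str>u         \ c-addr u -- u2 flag ; hypothetical
  0 0 2swap >number     \ -- ud2 c-addr2 u2
  nip 0=                \ -- ud2 flag ; true if nothing was left over
  >r drop r>            \ keep the low cell of ud2 and the flag
;

decimal
s" 123" str>u . .       \ displays -1 123
s" 12x" str>u . .       \ displays 0 12
```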

More string words

: $.            \ c-addr --
Output a counted string to the output device.

: (")           \ -- a-addr
OBSOLETE and REMOVED.

: ."            \ "ccc<quote>"  --
Output the text up to the closing double-quotes character.

variable ^null  \ -- *null
Return a "pointer-to-null" address.

: wcount         \ addr1 -- addr2 len
Given the address of a 16-bit word-counted string in memory WCOUNT will return the address of the first character and the length in characters of the string.

: (W")          \ -- waddr u ; step over caller's in line string
Returns the address and length of inline 16-bit word-counted and 16-bit zero-terminated string. Steps over the inline text to a cell-aligned boundary.

: ((W"))        \ -- waddr u ; dangerous factor!
A factor provided for the generation of long string actions that have to step over an inline string. For example, to define W." which uses a long string, you might compile (W.") and then use W", to compile the inline string. The definition of (W.") then might be:


: (W.")     \ --
  ((W")) type
;

Linked lists

: link,         \ var-addr -- ; lay a link in a chain whose head is at var-addr
Add a link to a chain anchored at address var-addr. The old contents of var-addr are added to the dictionary as the new link, and the address of the new link is placed at var-addr.
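
As an illustration of the behaviour described for LINK, the sketch below builds a chain of data cells and walks it. The names FRUIT-CHAIN, FRUIT, and .FRUITS are hypothetical. The commented definition shows one portable way to express the same behaviour on a system without LINK, .

```forth
\ Portable equivalent of LINK, per the description above, for reference:
\   : link,  ( var-addr -- )  here over @ , swap ! ;

variable fruit-chain  0 fruit-chain !   \ hypothetical anchor, empty chain

: fruit,        \ n -- ; lay a link followed by one data cell
  fruit-chain link,  , ;

: .fruits       \ -- ; walk from the newest entry to the oldest
  fruit-chain @
  begin  dup  while
    dup cell+ @ .               \ the data cell follows the link
    @                           \ step to the previous link
  repeat  drop ;

1 fruit,  2 fruit,  3 fruit,
.fruits                         \ displays 3 2 1
```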

: AddLink       \ item anchor -- ; add a new item to end of chain, link is first
Used instead of LINK, when a new item in the chain already exists, e.g. it has been ALLOCATEd. The item is added to the start of the chain. Note that this word requires the link to be at offset 0 in the item being added.

: AddEndLink    \ item anchor --
Add an item (a structure) to the end of the chain anchored at anchor. The link field must be at offset 0 in item.

: DelLink       \ item anchor -- ; remove item from chain
Delete/Remove an item from a chain anchored at address anchor. Note that this word requires the link to be at offset 0 in the item being removed.

: ExecChain     \ anchor --
Execute the contents of chain with the following structure:

  link | xt | ...

Each word that is run has the stack effect

  ^link -- ^link

Where ^link is the address of the link field in the structure. Thus, data that follows the xt can easily be accessed.

: AtExecChain   \ xt anchor --
Add the word whose xt is given to the chain anchored at address anchor.

: ShowChain     \ anchor --
Display the names of the words in the chain. If the word is headerless, the name of the first header before it will be shown.
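
A typical use of execution chains is to collect initialisation actions. The sketch below is based on the descriptions above; the anchor INIT-CHAIN and the two action words are hypothetical. Each action must leave the link address it was given on the stack.

```forth
variable init-chain  0 init-chain !     \ hypothetical anchor, empty chain

: open-log      \ ^link -- ^link
  cr ." opening log" ;
: load-config   \ ^link -- ^link
  cr ." loading configuration" ;

' open-log    init-chain AtExecChain
' load-config init-chain AtExecChain

init-chain ExecChain            \ run every action in the chain
init-chain ShowChain            \ list the names in the chain
```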

Wordlists and Vocabularies

Wordlists and vocabularies are described in a separate chapter.

Input Specification and Parsing

The Forth interpreter operates on a "terminal input buffer". This buffer is parsed space-delimited token by token by the system. Standard words exist for managing the source of the text.

0 value SOURCE-ID                               \                6.2.2218
SOURCE-ID describes the method used to refill the terminal input buffer. If the value is "0" the input source is the console, a value of "-1" indicates the input source is a string - via EVALUATE - any other value is taken to be a file-id for source inclusion from a text file.

: TIB           \ -- c-addr                                     6.2.2290
Returns the address of the terminal input buffer. Note that tasks requiring user input must initialise the USER variable 'TIB. New code should use SOURCE and TO-SOURCE instead for ANS Forth compatibility.

tib-len constant tib-len        \ -- u
Returns the size of the console input buffer.

: SOURCE        \ -- c-addr u                                   6.1.2216
Returns the address and length of the current terminal input buffer contents.

: TO-SOURCE     \ c-addr u --
Set the address and length of the system terminal input buffer.

: SAVE-INPUT    \ -- xn..x1 n                                    6.2.2182
Save all the details of the input source onto the data stack. If it later becomes necessary to discard the saved input, NDROP will do the job. If you want to move the data to the return stack, N>R and NR> are available.

: RESTORE-INPUT \ xn..x1 n -- flag                               6.2.2148
Attempt to restore the input specification from the data stack. If the stack picture between SAVE-INPUT and RESTORE-INPUT is not balanced, a non-zero value is returned in place of n. On success, 0 is returned.

: QUERY         \ --                                             6.2.2040
Reset the input source specification to the console and accept a line of text into the input buffer.

: REFILL        \ -- flag                                        6.2.2125
Attempt to refill the terminal input buffer from the current source. This may be a file or the console. An attempt to refill when the input source is a string will fail. The return result is a flag indicating success with TRUE and failure with FALSE. A failure to refill when the input source is a text file indicates the end of file condition.

: PARSE         \ char"ccc<char>" -- c-addr u                    6.2.2008
Parse the next token from the terminal input buffer using <char> as the delimiter. The next token is returned as a c-addr/u string description. Note that PARSE does not skip leading delimiters. If you need to skip leading delimiters, use PARSE-WORD instead.
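
For example, PARSE can collect text up to an arbitrary delimiter. In the sketch below, the hypothetical word .UNTIL-BAR displays everything up to the next '|' character. The single space after the word's own name has already been consumed by the text interpreter.

```forth
: .until-bar    \ "ccc<bar>" -- ; hypothetical
  [char] | parse type ;

.until-bar hello world|         \ displays hello world
```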

: PARSE-WORD    \ char -- c-addr u
An alternative to WORD below. The returned string is a c-addr/u pair rather than a counted string and no copy has occurred, i.e. the contents of HERE are unaffected. The returned string is in the input buffer, which should not be modified. Because no intermediate global buffers are used PARSE-WORD is more reliable than WORD for text scanning in multi-threaded applications and in callbacks.

: parse-name    \ -- c-addr u ; Forth200x
Equivalent to BL PARSE-WORD above. Do not modify the returned string if you want to be compliant with the ANS or Forth-2012 standards. PARSE-NAME can replace BL WORD COUNT in most cases. Because no intermediate global buffers are used PARSE-NAME is faster and more reliable than WORD for text scanning in multi-threaded applications and in callbacks.
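
A minimal sketch of a parsing word built on PARSE-NAME follows; the name .LEN is hypothetical. The string is used directly from the input buffer and is not modified.

```forth
: .len          \ "name" -- ; hypothetical
  parse-name            \ -- c-addr u ; string is still in the input buffer
  dup . ." chars: " type ;

.len example            \ displays 7 chars: example
```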

: WORD          \ char"<chars>ccc<char>" -- c-addr               6.1.2450
Similar behaviour to the ANS PARSE definition but the returned string is described as a counted string which is found at HERE.

: parse-leading \ char --
Skip over leading characters of char in the input stream. Tab characters are treated as spaces.

: GET-TOKEN     \ "<name>" -- addr
A version of BL WORD in which the returned string is converted to upper case.

: next-name     \ -- c-addr u
A version of parse-name that works across multiple lines. If a name cannot be obtained, the input stream is REFILLed.

: get-word      \ char -- c-addr
A version of WORD that works across multiple lines. If a word cannot be obtained, the input stream is REFILLed.

: GetPathSpec   \ -- c-addr u | c-addr 0 ; 0 if null string
Parse the input stream for a file/path name and return the address and length. If the name starts with a '"' character the returned string contains the characters between the first and second '"' characters but does not include the '"' characters themselves. If you need to include names that include '"' characters, delimit the string with '(' and ')'. In all other cases a space is used as the delimiting character. GetPathSpec does not expand text macro names.

: "xxx"         \ "xxx" -- caddr len
Parse a string enclosed by quotes from the input stream, e.g.

  "Quoted string"

Support for constructing words

defer DOCOLON,          \ --
Compile the code required at entry to a colon definition.

defer DOSEMICOLON,      \ --
Compile the code required at exit from a colon definition by ;.

defer Compile,          \ xt --                         6.2.0945
Compile the word specified by xt into the current definition. Only for "normal" words that are not NDCS.

defer ndcs,             \ xt --
Perform the compilation action of an NDCS word. This may have a stack effect or parse the input stream.

: compile-word  \ i*x xt -- j*x
Process an XT for compilation.

: (;CODE)       \ -- ; R: a-addr --
Part of the run time action of ;CODE and DOES>, executed when the defining word executes to create a new child word. Patch the last word defined (by CREATE) to have the run time action that follows immediately after (;CODE).

: DOCREATE,     \ --
Compile the run time action of CREATE.

: (ndcs,)       \ i*x xt -- j*x
Like (COMPILE,) but executes the NDCS action for a word and may parse and/or have a stack effect during compilation.

: LIT           \ -- x
Code which when CALLED at runtime will return an inline cell value.

#16 value /code-alignment       \ -- n
The default code alignment used by FASTER below. Must be a power of two.

#16 value /data-alignment       \ -- n
The default data alignment used by FASTER below. Must be a power of two.

/code-alignment value code-alignment    \ -- n
The start of a colon or CODE definition is aligned to an alignment boundary defined by this value, which must be a power of two.

/data-alignment value data-alignment    \ -- n
The start of the data areas defined by CREATE and friends is aligned to a boundary defined by this value, which must be a power of two.

: smaller       \ --
Selects smaller code using the minimum of alignment.

: faster        \ --
Selects faster code using the preset alignment in /CODE-ALIGNMENT, which will usually increase speed and the size of the dictionary headers.

: CODE-ALIGN    \ --
ALIGN filling with breakpoints (used for code boundaries).

: data-align    \ --
ALIGN filling with breakpoints (used for data boundaries). The alignment is followed by the run-time code for CREATE and the data area is then aligned on the boundary.

: set-compiler  \ xt --
Set xt as the compiler of the LATEST definition. The word whose xt is given to SET-COMPILER receives the xt of the word it is to compile ( xt -- ). This is done so that information can be extracted from the word. If you use this in a defining word use INTERP> rather than DOES>. See the VFX code generator section of the manual for more details.

: get-compiler  \ -- xt
Get xt of the compiler of the LATEST definition. If the return value is zero, the word has no compiler.

Defining words

These words are involved in the construction of new words.

: (:)           \ C: caddr len -- colon-sys ; Exec: i*x -- j*x ; R: -- nest-sys
Begin a new colon definition with the name given by caddr/len.

: :             \ C: "<spaces>name" -- colon-sys ; Exec: i*x -- j*x ; R: -- nest-sys 6.1.0450
Begin a new definition called name.

: :NONAME       \ C: -- colon-sys ; Exec: i*x -- j*x ; R: -- nest-sys  6.2.0455
Begin a new colon definition which does not have a name. After the definition is complete the semi-colon operator returns the XT of the newly compiled code on the stack.

: ;             \ C: colon-sys -- ; Run: -- ; R: nest-sys --    6.1.0460
Complete the definition of a new 'colon' or :NONAME word.
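
The xt left by a :NONAME definition is usually saved or passed straight to another word. A minimal sketch in standard Forth, with the hypothetical name SQUARE-XT:

```forth
:noname  ( n -- n^2 )  dup * ;  \ the semi-colon leaves the xt on the stack
constant square-xt              \ hypothetical name for the saved xt

decimal
7 square-xt execute .           \ displays 49
```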

: DOES>         \ C: colon-sys1 -- colon-sys2 ; R: nest-sys --  6.1.1250
Begin definition of the runtime-action of a child of a defining word. You may not use RECURSE after DOES>.
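
The classic illustration of CREATE ... DOES> is a defining word for cell arrays. This is a sketch in standard Forth; the names ARRAY and DATA are hypothetical and no bounds checking is performed.

```forth
: array         \ n "name" -- ; Exec: i -- addr
  create  cells allot           \ reserve n cells in the child
  does>   swap cells +          \ index i -> address of element i
;

decimal
10 array data
42 3 data !                     \ store 42 in element 3
3 data @ .                      \ displays 42
```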

: INTERP>         \ C: colon-sys1 -- colon-sys2 ; R: nest-sys --
Begin definition of the runtime-action of a child of a defining word that sets a compiler with SET-COMPILER for its children between CREATE and INTERP>. You may not use RECURSE after INTERP>. INTERP> and SET-COMPILER are used to avoid defining words with state-smart run-time actions.

: COMP:         \ --
Start a :NONAME word that is the compiler for the previous word. When executed, the :NONAME word is passed the xt of the word that is being compiled. Note that the COMP: word must not contain the word it is applied to because the word would be compiled before its compilation word has been completed.

: >DOES         \ xt -- addr
Given the xt of the child of a defining word, return the address of the run-time code.

: Synonym       \ "<new-name>" "<curdef>" --
Create a new definition which redirects to an existing one. Normal dictionary searches for <new-name> will return the xt of <curdef>.

: Alias:        \ <"new-name"> <"curdef"> --
A synonym for SYNONYM.

: CONSTANT      \ x "<spaces>name"  -- ; Exec: -- x             6.1.0950
Create a new CONSTANT called name which has the value x. When NAME is executed the value is returned.

: 2constant     \ n1 n2 -- ; Exec -- n1 n2                     8.6.1.0830
A double number equivalent of CONSTANT.

: VARIABLE      \ "<spaces>name" -- ; Exec: -- a-addr           6.1.2410
Create a new variable called name. When Name is executed the address of the data-cell is returned for use with @ and ! operators.

: 2VARIABLE     \ "<spaces>name" -- ; Exec: -- a-addr         8.6.1.0440
A double number equivalent of VARIABLE.

: user          \ u "<name>" -- ; Exec: -- addr
Create a new USER variable called name. The 'u' parameter specifies the index into the user-area table at which to place the variable. A -405 THROW occurs if there is no more user space. The VFX kernel supports 4K bytes of USER area space starting at offset 4096. USER variables are located in a separate area of memory for each task or callback procedure. They are equivalent to "thread local storage" in Windows parlance. Use in the form:

  $1000 USER TaskData

: +USER         \ n "<spaces>name" -- ; Exec: -- user-a-addr
Create a new USER variable called name and reserve N bytes of USER space, e.g. 8 CELLS +USER TaskStruct. N is rounded up to the next CELL boundary. See USER above. The use of +USER avoids having to keep track of assigned USER variable offsets. +USER is non-ANS but for portability is trivially defined by:


VARIABLE NEXTUSER
: +USER         \ n -- ; -- addr
  NextUser @ user  aligned NextUser +!
;

: u#            \ "<name>"-- u
Return the index of the USER variable whose name follows, e.g.

  u# S0

: Buffer:       \ n "name" -- ; [child] -- addr
Create a memory buffer called name which is 'n' bytes long. When name is executed the address of the buffer is returned.
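
A short usage sketch follows; the buffer name SCRATCH and its size are arbitrary. It assumes that S" may be used interpretively, as it can in systems that provide the ANS FILE wordset.

```forth
decimal
256 buffer: scratch             \ hypothetical 256 byte buffer
scratch 256 erase               \ clear it
s" hello" scratch swap move     \ copy five characters into the buffer
scratch 5 type                  \ displays hello
```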

: value         \ n -- ; ??? -- ??? ; 6.2.2405
Create a variable with an initial value. When the VALUE's name is referenced, the value is returned. Precede the name with TO or -> to store to it. Precede the name with ADDR to get the address of the data. The full list of operators is displayed by .OPERATORS ( -- ).


5 VALUE FOO                \ initial value of FOO is 5
  FOO .                    \ will give 5
  6 TO FOO                 \ new value is 6
  FOO .                    \ will give 6
  ADDR FOO @ .             \ will give 6

: 2value        \ x1 x2 -- ; ??? -- ??? ; 6.2.2405
Create a cell pair with an initial value. When the 2VALUE's name is referenced, the value is returned. Precede the name with TO or -> to store to it. Precede the name with ADDR to get the address of the data.

: operator      \ n --
Define an operator with the given number.

: Operator:     \ --
Define a new operator with automatic numbering.

: op#           \ "name" -- n [int] ; "name" -- [comp]
Return or compile the operator number.

: .Operators    \ --
List the operators by number and name.

The standard VFX Forth set of operators is as follows. All of them are supported by children of VALUE, but not all are supported by other words that use operators.


 0 operator default      \ fetch
 1 operator ->           \ store
 1 operator to           \   "
 2 operator addr         \ address operator
 3 operator inc          \ increment by one
 4 operator dec          \ decrement by one
 5 operator add          \ add stack item to contents
 6 operator zero         \ set to zero
 7 operator sub          \ subtract stack item from contents
 8 operator sizeof       \ return item size
 9 operator set          \ set to -1

The following are provided to ease porting from other systems.


 5 operator +to          \ add stack item to contents
 7 operator -to          \ subtract stack item from contents
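
As a sketch of how these operators read in source, assuming each one is used as a prefix in the same way as TO, a VALUE can be updated without fetching it first. The name HITS is hypothetical.

```forth
decimal
0 value hits            \ hypothetical counter
inc hits                \ hits is now 1
5 add hits              \ hits is now 6
2 -to hits              \ hits is now 4
hits .                  \ displays 4
zero hits               \ hits is now 0
```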

: DEFER         \ Comp: "<spaces>name" -- ; Run: i*x -- j*x
Creates a new DEFERed word. A default action, CRASH, is automatically assigned. See CRASH and the section on vectored execution.
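
A minimal sketch of vectored execution follows. It assumes the Forth200x word IS is used to assign the action; the names GREET, HELLO, and HOWDY are hypothetical.

```forth
defer greet                     \ action is CRASH until assigned

: hello  cr ." Hello" ;
: howdy  cr ." Howdy" ;

' hello is greet   greet        \ displays Hello
' howdy is greet   greet        \ displays Howdy
```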

Compilation tools

These words are mostly used for building new interpreting and compiling words.

: !CSP          \ x --
Mark the position of the compilation stack pointer for later compile time checking.

: ?CSP          \ --
Check that the compilation stack pointer is the same as when last marked by !CSP.

: ?EXEC         \ --
Perform #-403 THROW if not in interpretation state.

: ?COMP         \ --
Perform -14 THROW if not in compilation state.

: ?STACK        \ --
Perform -4 THROW if the data stack pointer is out of range.

: ?UNDEF        \ flag --
Perform -13 THROW if flag is false/0, usually because a word is undefined.

: [             \ --                                            6.1.2500
Switch compiler into interpreter state.

: ]             \ --                                            6.1.2540
Switch compiler into compilation state.

Literal tools

: LITERAL       \ Comp: x -- ; Run: --  x                       6.1.1780
Compile a literal into the current definition. Usually used in the form

  [ <expression> ] LITERAL

inside a colon definition. Note that LITERAL is IMMEDIATE.
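
For example, a constant expression can be evaluated once at compile time rather than on every call. The word name SECS/DAY is hypothetical.

```forth
decimal
: secs/day      \ -- n
  [ 24 60 * 60 * ] literal      \ computed while compiling
;

secs/day .                      \ displays 86400
```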

: DLITERAL      \ Comp: d -- ; Run: -- d
A double number version of LITERAL.

: 2LITERAL      \ Comp: x1 x2 -- ; Run: -- x1 x2              8.6.1.0390
A two cell version of LITERAL.

: DoIsNumber?     \ caddr len -- Nn .. N1 n | 0
Wrapper for isNumber? Used by the system to add the XREF hook for literals. See isNumber?.

Finding xts

: '             \ "<spaces>name" -- xt                          6.1.0070
Find the xt of the next word in the input stream. An error occurs if the xt cannot be found.

: [']           \ Comp: "<spaces>name" -- ; Run: -- xt          6.1.2510
Find the xt of the next word in the input stream, and compile it as a literal. An error occurs if the xt cannot be found.

: 'syn          \ "<spaces>name" -- xt
Find the xt of the next word in the input stream. Unlike ' above, if the word is a child of SYNONYM, the xt of the SYNONYM is returned, not the xt of the original word.

defer Compile,  \ xt --                                         6.2.0945
Compile the word specified by xt into the current definition.

: EXECUTE       \ xt --                                         6.1.1370
Execute the word specified by xt.
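
The sketch below shows ' used while interpreting and ['] used inside a definition, with both xts passed to EXECUTE. All of the word names are hypothetical.

```forth
: apply-twice   \ x xt -- x' ; hypothetical
  dup >r execute                \ first application
  r> execute ;                  \ second application

: double  ( n -- 2n )  2* ;

decimal
3 ' double apply-twice .        \ displays 12

: quad    ( n -- 4n )  ['] double apply-twice ;
5 quad .                        \ displays 20
```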

: [COMPILE]     \ "<spaces>name" -- ; 6.2.2530
Compile the compilation action of the next word in the input stream. [COMPILE] ignores the IMMEDIATE state of the word. Its operation is mostly superseded by POSTPONE. See also [INTERP] below.

: [INTERP]      \ "<spaces>name" --
Compile the interpretation action of the next word in the input stream. [INTERP] is necessary when you want the interpretation behaviour of words such as S" to be compiled. See also [COMPILE] above.

Parsing strings and characters

: $,            \ caddr len --
Lay the string into the dictionary at HERE, reserve space for it and ALIGN the dictionary.

: ",            \ "ccc<quote>"  --
Parse text up to the closing quote and compile into the dictionary at HERE as a counted string. The end of the string is aligned.

: ,"            \ "ccc<quote>"  --
An alias for ", added because it is in common use.

: S"            \ Comp: "ccc<quote>" -- ; Run: -- c-addr u      6.1.2165
Describe a string. Text is taken up to the next double-quote character. The address and length of the string are returned.

: C"            \ Comp: "ccc<quote>" -- ; Run: -- c-addr        6.2.0855
As S" except the address of a counted string is returned.

: Z"            \ Comp: "ccc<quote>" -- ; Run: -- c-addr
A version of C" which returns the address of a zero-terminated string.
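
The difference between the string forms shows up in how the result is consumed. A minimal sketch using the two standard forms, with the hypothetical name .HELLO:

```forth
: .hello        \ -- ; hypothetical
  s" Hello, " type              \ address and length, ready for TYPE
  c" world" count type          \ counted string, so COUNT it first
;

.hello                          \ displays Hello, world
```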

create EscapeTable      \ -- addr
Table of translations for \a..\z.

: parse\"       \ caddr len dest -- caddr' len'
Parses a string up to an unescaped '"', translating '\' escapes to characters much as C does. The returned translated string is a counted string at dest. The supported escapes (case sensitive) are:

  \a                    BEL (alert)
  \b                    BS (backspace)
  \e                    ESC (escape, ASCII 27)
  \f                    FF (form feed, ASCII 12)
  \l                    LF (ASCII 10)
  \m                    CR/LF pair - for HTML etc.
  \n                    newline - CR/LF for Windows/DOS, LF for Unices
  \q                    double-quote
  \r                    CR (ASCII 13)
  \t                    HT (tab, ASCII 9)
  \v                    VT
  \z                    NUL (ASCII 0)
  \"                    "
  \[0-7]+               Octal numerical character value, finishes at the first non-octal character
  \x[0-9a-f][0-9a-f]    Two digit hex numerical character value.
  \\                    backslash itself
  \                     before any other character represents that character

: readEscaped   \ "string" -- caddr
Parses an escaped string from the input stream according to the rules of parse\" above, returning the address of the translated counted string.

: \",           \ "string" --
Parse text up to the closing quote and compile into the dictionary at HERE as a counted string. The end of the string is aligned.

: .\"            \ "ccc<quote>"  --
As .", but translates escaped characters using parse\" above.

: S\"           \ "string" -- caddr u
As S", but translates escaped characters using parse\" above.

: C\"           \ "string" -- caddr
As C", but translates escaped characters using parse\" above.

: Z\"            \ "string" -- z$
As Z", but translates escaped characters using parse\" above.

: z\",          \ "cc<quote>" --
Parse text up to the closing quote and compile into the dictionary at HERE as a zero terminated string. The end of the string is not aligned.
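
As a sketch of the escape handling, the word below uses S\" with the tab, newline, and double-quote escapes listed for parse\". The name .REPORT and the text are hypothetical.

```forth
: .report       \ -- ; hypothetical
  s\" name:\tVFX\nquote:\t\qForth\q\n" type ;

.report         \ two lines, each with a tab after the label
```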

: CHAR          \ "<spaces>name" -- char                        6.1.0895
Return the first character of the next token in the input stream. Usually used to avoid magic numbers in the source code.

: [CHAR]        \ Comp: "<spaces>name" -- ; Run: -- char        6.1.2520
Compile the first character of the next token in the input stream as a literal. Usually used to avoid magic numbers in the source code.
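
For example, CHAR is used while interpreting and [CHAR] inside a definition. The word name STARS is hypothetical.

```forth
: stars         \ n -- ; hypothetical: display n asterisks
  0 ?do  [char] * emit  loop ;

decimal
char A .                \ displays 65
3 stars                 \ displays ***
```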

: SLITERAL      \ comp: c-addr1 u -- ; Run: -- c-addr2 u     17.6.1.2212
Compile the string c-addr1/u into the dictionary so that at run time the identical string c-addr2/u is returned. Note that because of the use of dynamic strings at compile time the address c-addr2 is unlikely to be the same as c-addr1.

Comments

: \             \ "ccc<eol>" -- ; 6.2.2535
Begin a single-line comment. All text up to the end of the line is ignored.

: (             \ "ccc<paren>"  -- ; ( ... ) ; 6.1.0080
Begin an inline comment. All text up to the closing bracket is ignored. In VFX Forth, the comment may extend over several lines.

: .(            \ "cc<paren>"  -- ; .( ... )                    6.2.0200
A documenting comment. Behaves in the same manner as ( except that the enclosed text is written to the console.

: ParseUntil    \ c-addr u --
Parse the input stream for a white-space delimited string, REFILLing as necessary until the string is found or input is exhausted. Mostly used for block comments. The string compare is case insensitive.

: ((            \ -- ; (( ... ))
Block comment operator. Any source following this is ignored up to and including the terminator, '))', which must be white space separated.

: (*            \ -- ; (* ... *)
Block comment operator. Any source following this is ignored up to and including the terminator, '*)', which must be white space separated.

: #!            \ -- ; #! /bin/bash
Begin a single-line comment. All text up to the end of the line is ignored. This form is provided for Unix-based systems whose shells use #! to specify the program to use with the file.

: StopIncluding \ --
Used in a source file to skip the rest of a file, otherwise behaves like \.

: \\            \ --
A synonym for StopIncluding above.

Generic stack get/set

What is called a stack here is actually a table whose first element contains the number of items in the stack. The 'top' of the stack is the last entry in the table.

  n | xn | ... | x1 |

#32 constant /max-stack \ -- u
Maximum number of items in a stack.

: stack:        \ "<name>" -- ; -- stack
Create a stack of /max-stack elements. At run time, name returns the address of the stack, e.g.

  stack: rec-stack1

: get-stack     \ stack -- x1 ... xn n
Retrieve the contents of the stack, where xn is the top of the stack.

: set-stack     \ x1 ... xn n stack --
Place n items on the empty stack, where xn is the top of the stack.

: -stack        {: x astack -- :}
Remove all instances of the given item from the given stack.

: +stack        \ x astack --
The given item is added as the top of the stack. Duplicate entries are removed.

: +stack-bot    \ x astack --
The given item is added at the bottom of the stack. Duplicate entries are removed.
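
The sketch below strings these words together, following the stack comments and descriptions above. The name MY-STACK is hypothetical, and a newly created stack is assumed to be empty.

```forth
decimal
stack: my-stack                 \ hypothetical stack
10 20 30 3 my-stack set-stack   \ 30 becomes the top item
40 my-stack +stack              \ 40 is the new top item
20 my-stack -stack              \ remove every 20
my-stack get-stack              \ -- 10 30 40 3
```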

Text interpreter

From VFX 5.1 onwards, the text interpreter is built around recognisers. The major advantage of recognisers is that they provide an extensible interpreter, and in particular they permit different OOP packages to be installed and removed at will, so permitting the various OOP packages to coexist in one application.

Recognisers are a technique to allow the Forth text interpreter to be extended and reconfigured for application purposes. Text items are parsed to see if they fit into one or another type. An action for the appropriate type is then performed.

A recogniser for a particular type of data consists of a parsing word which determines whether the string matches a particular data type. Associated with each parsing word is a data structure that holds the interpret, compile and postpone xts for that data type. On a match, the relevant xt is executed.

Applications use a set of recognisers as required. In VFX, the minimum set is usually the Forth word finder (dictionary look up) and the literal handler (single, double, float). A set of recognisers is held as a table of xts of the parsing words.

Recognizer type structure

The recognizer type structure is the interface between the type information returned by the parser and the actions (xts) contained in the type structure.

: RecType:      \ xtint xtcomp xtpost "name" -- ; -- struct
Create a recogniser data structure holding the three actions (interpret, compile, postpone) for its data type.

: RECTYPE>INT  ( rectype-token -- xt-interpret)             @ ;
Given a recogniser structure, return the interpretation xt for it. Execution of the xt converts the data returned by the parser into the form returned by the text interpreter.

: RECTYPE>COMP ( rectype-token -- xt-compile  )       cell+ @ ;
Given a recogniser structure, return the compilation xt for it. Execution of the xt compiles the data returned by the parser into the form used inside a colon definition.

: RECTYPE>POST ( rectype-token -- xt-postpone ) cell+ cell+ @ ;
Given a recogniser structure, return the postpone xt for it.
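
The way these accessors select an action can be sketched as a state-dependent dispatch. The word do-token is hypothetical and ignores the POSTPONE state; it is an illustration only, not the VFX implementation.

  : do-token      \ i*x rectype -- j*x ; hypothetical helper
    state @ if
      RECTYPE>COMP          \ compiling: fetch the compile action
    else
      RECTYPE>INT           \ interpreting: fetch the interpret action
    then
    execute                 \ run the selected action on the parsed data
  ;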

Word and number recognition

: ]]            \ --
Switch the compiler into POSTPONE state. All words between ]] and [[ are POSTPONEd. The code below is equivalent to postpone dup postpone 5 postpone 6.

: t  ]] dup 5 6 [[  ;

The ]] ... [[ notation is experimental and may be removed in a future version.

: [[
Switch the compiler out of POSTPONE state.

: Undefined     \ c-addr u --
Default action taken by compiler when a parsed token is not recognised as a word or number.

' undefined dup dup RecType: r:fail     \ -- struct
Contains the three actions for unrecognised words, i.e. the fail case.

' execute ' compile, ' postnorm RecType: r:word
Contains the three actions for non-immediate words.

' execute ' ndcs, ' postndcs RecType: r:ndcs
Contains the three actions for NDCS words.

' execute ' execute ' postimm RecType: r:immediate
Contains the three actions for immediate words.

: rec-find      \ addr u -- xt r:word | r:fail
Searches for a word in the search order (wordlist stack). The xref utility code is contained inside the dictionary search code.
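
A minimal sketch of calling rec-find directly, assuming the stack comment above is exact: on failure only r:fail is returned, and on success an xt lies beneath the type. The word known-word? is hypothetical.

  : known-word?   \ c-addr u -- flag ; hypothetical helper
    rec-find dup r:fail = if
      drop false            \ only r:fail was returned
    else
      2drop true            \ discard the xt and its type
    then
  ;

  s" dup" known-word? .     \ DUP is a dictionary word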

: xlit,         \ x1 .. xn n --
Lay a numeric literal.

: post-xlit     \ x1 .. xn n --
Postpone a numeric literal.

' drop  ' xlit,  ' post-xlit RecType: r:num     \ -- type
Contains the three actions for numbers, including single integers, double integers and floating point numbers.

: rec-num       \ addr u -- n 1 r:num | d 2 r:num | r: -2 r:num | r:fail
The parsing portion that checks for a literal number.

Main recognizer and text interpreter

/max-stack 1+ cells buffer: main-recognizer     \ -- stack
The default stack used for recognizers.

main-recognizer value forth-recognizer
Holds the current recognizer stack, initially main-recognizer.

: get-recognizers       \ -- xt1 ... xtn n
Return the content of the recognizer stack.

: set-recognizers       \ xt1 ... xtn n --
Set the recognizer stack from the data stack.

: recognize     \ caddr len stack -- tokens data-type | fail-type
Apply a recognizer stack to a string, delivering optional tokens and a data type indicator.
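
As an illustration, recognize can be applied to a private recognizer stack rather than the current one. The name num-only is hypothetical, and the results noted in the comments are taken from the stack comments of rec-num and recognize, not from a test run.

  stack: num-only                    \ hypothetical private recognizer stack
  ' rec-num 1 num-only set-stack     \ it holds only the number recognizer

  s" 1234" num-only recognize        \ a number: tokens and r:num expected
  s" dup"  num-only recognize        \ not a number: r:fail expected

Because num-only does not contain rec-find, dictionary words such as DUP are not recognised through it.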

: parser        \ addr u -- i*x xt
Pass the string to the current recognizer stack and extract the xt needed to process it.

: page-check    \ --
For legacy reasons, the VFX INCLUDE allows files with page breaks (ASCII form feed character) at the start of the line and replaces them with spaces.

: (interpret)   \ --
The default action of INTERPRET.

: postpone      \ "<name>" -- ; POSTPONE <name> ; 6.1.2033
Append the compilation semantics of name to the current definition, i.e. the one containing POSTPONE <name>. For a normal word, the current definition compiles <name>. For an immediate word, the current definition will execute <name> rather than it executing immediately. POSTPONE delays the execution of a word by one time frame.
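
A short sketch of the immediate-word case. ENDIF and .sign are hypothetical names; THEN is immediate, so POSTPONE defers its execution until ENDIF itself runs inside a later definition.

  : endif         \ -- ; hypothetical alias for THEN
    postpone then
  ; immediate

  : .sign         \ n -- ; uses ENDIF exactly as it would use THEN
    0< if  ." negative"  endif
  ;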

: EVALUATE      \ i*x c-addr u -- j*x                         6.1.1360
Process the supplied string as though it had been entered via the interpreter.

: assess        \ i*x c-addr u -- j*x
A version of EVALUATE that saves the current state, switches to interpret mode, interprets the string and then restores state.
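
For example, EVALUATE can be given a string built or stored elsewhere. The word five is hypothetical.

  : five          \ -- ; displays 5
    s" 2 3 + ." evaluate
  ;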

: init-quit     \ --
Perform the set up required before entering the text interpreter.

defer QuitHook  \ --
A placeholder for user-defined clean-up actions after a THROW occurs in QUIT.

: reset-stacks  \ ?? -- ; F: ?? --
Reset the data and floating point stacks.

DEFERred words and Vectored Execution

A DEFERred word is defined at one point in the source and can have its action ASSIGNed later both during compile time and at execution time. It is similar to a VARIABLE which has @ EXECUTE appended to its execution semantics.

DEFER words are used to

: CRASH         \ --
The default action of a DEFERred word. CRASH will THROW a code back to the system.

: DEFER         \ Comp: "<spaces>name" -- ; Run: i*x -- j*x
Creates a new DEFERred word. A default action, CRASH, is automatically assigned.

: IS            \ xt "<spaces>name" -- ; Forth200x
The second part of the ' xxx IS yyy construct. IS assigns the given XT to be the action of the DEFERred word yyy, which is named in the input stream.

: ASSIGN        \ "<spaces>name"  -- xt
A state-smart word to get the XT of a word. The source word is parsed from the input stream. Used as part of an ASSIGN xxx TO-DO yyy construct.

: TO-DO         \ xt "<spaces>name"  --
The second part of the ASSIGN xxx TO-DO yyy construct. TO-DO assigns the given XT to be the action of the DEFERred word yyy, which is named in the input stream.

: action-of     \ "<name>" -- xt ; Forth200x
Returns the xt of the current action of the DEFERred word whose name is given. Use in the form ACTION-OF <deferred-word> if you need to save and later restore the action of a word. The xt returned by ACTION-OF can be used by TO-DO.

: BEHAVIOR      \ "<spaces>name" -- xt
Returns the xt of the current action of the DEFERred word whose name is given. Since BEHAVIOR is just a synonym for ACTION-OF, it is obsolete and has been removed.

: DEFER@        \ xt1 -- xt2                            Forth200x
Given xt1, the xt of a DEFERred word, return xt2, the action of xt1.

: DEFER!        \ xt1 xt2 --                            Forth200x
Xt1 becomes the action of the DEFERred word defined by xt2.
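
The words above can be combined as in the following interactive sketch. The names greet, hello and bye are hypothetical.

  defer greet                   \ hypothetical DEFERred word
  : hello   ." Hello " ;
  : bye     ." Goodbye " ;

  ' hello is greet              \ GREET now runs HELLO
  assign bye to-do greet        \ same kind of assignment with ASSIGN ... TO-DO
  action-of greet               \ -- xt ; save the current action (BYE)
  ' hello ' greet defer!        \ DEFER! takes the action first, then the DEFERred word
  is greet                      \ restore the saved action, so GREET runs BYE again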

Time and Date

0 value dow     \ -- dow ; 0=Sunday
Returns the local day of the week, starting at 0=Sunday. This value is updated when TIME&DATE below is called.

: time&date     \ -- seconds mins hours day month year
Return the operating system local time and date, and set DOW as a side effect.
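
A minimal sketch of consuming the six results; the year is on top of the stack. The word .now is hypothetical and does not pad single-digit fields.

  : .now          \ -- ; display the date as year-month-day hours:mins:secs
    time&date                               \ -- sec min hour day month year
    0 .r  ." -" 0 .r  ." -" 0 .r  space     \ year, month, day
    0 .r  ." :" 0 .r  ." :" 0 .r            \ hours, minutes, seconds
  ;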

0 value SysDow  \ -- dow ; 0=Sunday
Returns the system day of the week, starting at 0=Sunday. This value is updated when SYSTIME&DATE below is called.

: systime&date     \ -- seconds mins hours day month year
Return the operating system's system time and date (as opposed to the local time), and set SYSDOW as a side effect.

Millisecond timing

Most timing in VFX Forth applications uses a millisecond timer provided by the host operating system. The words provided are compatible with those used by MPE's embedded systems. The primary word is ticks, which returns a time in milliseconds.

defer ms        \ n --
Wait for n milliseconds. Calls the multitasker through PAUSE.

defer ticks     \ -- n
Return the system timer value in milliseconds. Treat the returned value as a 32 bit unsigned number that wraps on overflow.

: later         \ n -- n'
Generates a time value for termination in n milliseconds time. Because many applications use a timer value of zero to indicate that a timer is not in use, later never returns a value of zero, and always forces the bottom bit of n' to be set to 1.

: expired       \ n -- flag ; true if timed out
Flag is returned true if the time value n has timed out. Calls PAUSE.

: timedout?     \ n -- flag ; true if timed out
Flag is returned true if the time value n has timed out. Does not call PAUSE, so timedout? can be used in callbacks. In particular, TIMEDOUT? should be used rather than EXPIRED inside timer action words to reduce timer jitter.
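
The usual timeout pattern with these words can be sketched as follows. The words ready?, wait-ready and elapsed are hypothetical, and the 500 ms period is an arbitrary choice.

  : ready?        \ -- flag ; hypothetical condition, here "a key is waiting"
    key?
  ;

  : wait-ready    \ -- flag ; true if READY? succeeded within 500 ms
    500 later                       \ -- timeout
    begin
      ready? if  drop true exit  then
      dup expired                   \ calls PAUSE while polling
    until
    drop false
  ;

  : elapsed       \ xt -- ms ; run xt and return the elapsed milliseconds
    ticks >r  execute  ticks r> -
  ;

Inside a callback or timer action word, timedout? would replace expired in the loop.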

Heap - Runtime memory allocation

The heap memory access wordset is compliant with the ANS Standard. The heap is provided and managed by the host operating system and is only limited by the available memory and/or maximum paging file size. See the later paragraphs for implementation-specific details.

defer allocate  \ size -- a-addr ior
Allocate SIZE address units of contiguous data space. If successful an aligned pointer and a 0 IOR are returned. On failure the A-ADDR item is invalid and a non-0 IOR is returned. The contents of newly allocated heap memory are undefined.

defer resize    \ a-addr newlen -- a-addr ior
Attempt to resize a block of allocated heap memory to newlen size in address units. The contents of the memory block are preserved on a successful resize operation but the address of the memory block may change depending on heap load and the type of resizing requested.

defer free      \ a-addr -- ior
Attempt to release allocated memory at A-ADDR back to the system. IOR will return as 0 on success or non-zero for failure.

: ProtAlloc     \ n -- addr
A protected version of ALLOCATE which THROWs on failure.

: ProtFree      \ addr --
A protected version of FREE which does nothing if addr=0, and THROWs on failure.
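
A minimal sketch of the two styles of use. The words heap-demo and heap-demo2 are hypothetical, and the block size of 256 is arbitrary.

  : heap-demo     \ -- ; explicit ior handling
    256 allocate throw      \ -- addr ; THROW the ior if it is non-zero
    dup 256 erase           \ contents are undefined, so clear them
    free throw              \ release the block, THROW on failure
  ;

  : heap-demo2    \ -- ; the same using the protected words
    256 ProtAlloc           \ -- addr
    dup 256 erase
    ProtFree
  ;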

From VFX Forth 4.0 onwards, the heap system has changed. ALLOCATE, FREE and RESIZE are now directly DEFERred to use operating system dependent words.

Under Windows the new heap is much faster but is far less tolerant of programming errors. In particular, releasing the same block twice or FREEing memory you did not ALLOCATE may/will lead to a crash with the crash screen showing a fault outside VFX Forth. Newly allocated memory is zeroed and executable.

The Linux man page for malloc() says:

"Crashes in malloc(), free() or realloc() are almost always related to heap corruption, such as overflowing an allocated chunk or freeing the same pointer twice."

The SYSTEM vocabulary contains INITVFXHEAP ( -- ) and TERMVFXHEAP ( -- ) which initialise and destroy the heap. They are in the cold and exit chains. Note that if you are generating a DLL or shared library, these words must be explicitly run as the cold and exit chains are not run before DLLMAIN.

Nested definitions

Quotations provide nested colon definitions, in which the inner definition(s) are nameless. The expression:


  : foo ... [: some words ;] ... ;

is equivalent to:


:noname some words ; Constant #temp#
: foo ... #temp# ... ;

A simple quotation is an anonymous colon definition that is defined inside a colon definition or another quotation. It has no access to locals of the enclosing definitions. Quotations can use local variables and RECURSE.

A good example use of quotations is to provide a solution to the use of CATCH in a form like the TRY ... EXCEPT blocks of other languages.


: foo     \ i*x -- j*x
  setup
  [: fee fi fo fum ;] catch
  if ... then
  teardown
  ( throw again )
;

: [:            \ comp: -- i*x orig colon-sys
Compilation: suspends compiling to the current definition, starts a new nested definition, and compilation continues with this nested definition. Outer locals are not visible in the nested definition. Locals may be defined in the nested definition. Inside the nested definition RECURSE applies to the nested definition.

: ;]            \ comp: i*x orig colon-sys -- ; run-time: -- xt
Compilation: Ends the current nested definition, and resumes compilation to the previous current definition. At run-time the xt of the nameless definition is returned.
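
As a small illustration, a quotation can be passed to a word that takes an xt. The words twice and hi-hi are hypothetical.

  : twice         \ xt -- ; execute xt two times
    dup >r execute r> execute
  ;

  : hi-hi         \ -- ; displays "hi hi "
    [: ." hi " ;] twice
  ;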