VFX Forth has a built-in assembler. It lets you write time-critical definitions, and do things that are more difficult in high-level Forth, such as interrupt service routines. The assembler supports the 80386 and 80387 instruction sets. Definitions written in assembler may use all the variables, constants, etc. used by the Forth system, and may be called from the keyboard or from other words just like any high-level Forth word. When writing a code definition, remember which machine registers are used by the Forth system itself. These registers are documented later in this chapter. All other registers may be used freely. The reserved registers may also be used, but their contents must be saved before use and restored afterwards.
The mnemonics used in the Forth assembler are the same as those documented in the Intel literature, and the operand order is also the same. The only difference is that each portion of an instruction must be separated by a space, which is a requirement of the Forth interpreter.
The assembler has certain defaults. These cover the order of the operands, the default addressing modes and the segment size. These are described later in this chapter.
Normally the assembler will be used to create new Forth words written in assembler. Such words use CODE and END-CODE in place of : and ; or CREATE and ;CODE in place of CREATE and DOES>.
The word CODE creates a new dictionary header and enables the assembler.
As an example, study the definition of 0< in assembly language. The word 0< takes one operand from the stack and returns a true value, -1, if the operand was less than zero, or a false value, 0, if the operand was greater than or equal to zero.
CODE 0<               \ n - t/f ; define the word 0<
  OR   EBX, EBX       \ use OR to set flags
  L,                  \ less than zero ?
  IF,                 \ y:
    MOV  EBX, # -1    \ -1 is true flag
  ELSE,               \ n:
    SUB  EBX, EBX     \ dirty set to 0
  ENDIF,
  NEXT,               \ return to Forth
END-CODE
Notice how the word NEXT, is used. NEXT, is a macro that assembles a return to the Forth inner interpreter. All code words must end with a return to the inner interpreter. The example also demonstrates the use of structuring words within the assembler. These words are pre-defined macros which implement the necessary branching instructions. The next example shows the same word, but implemented using local labels instead of assembler structures for the control structures.
CODE 0<               \ n - t/f ; define the word 0<
  OR   EBX, EBX       \ use OR to set flags
  JGE  L$1            \ skip if EBX>=0
  MOV  EBX, # -1      \ -1 is true flag
  JMP  L$2            \ this part done
L$1:                  \ do following otherwise
  SUB  EBX, EBX       \ dirty set to 0
L$2:
  NEXT,               \ return to Forth
END-CODE
There are several useful words provided within VFX Forth to control the use of the assembler.
;code \ --
Used in the form:
: <namex> CREATE .... ;CODE ... END-CODE
Stops compilation, and enables the assembler. This word is used with CREATE to produce defining words whose run-time portion is written in code, in the same way that CREATE ... DOES> is used to create high level defining words.
The data structure is defined between CREATE and ;CODE and the run-time action is defined between ;CODE and END-CODE. The current value of the data stack pointer is saved by ;CODE for later use by END-CODE for error checking.
When a word created by <namex> executes, the address of its data area will be the top item of the CPU call stack. You can get the address of the data area by POPing it into a register.
A definition of VARIABLE might be as follows:
: VARIABLE
  CREATE  0 ,
;CODE
  sub  ebp, 4
  mov  0 [ebp], ebx
  pop  ebx
  next,
END-CODE
VARIABLE TEST-VAR
CODE \ --
A defining word used in the form:
CODE <name> ... END-CODE
Creates a dictionary entry for <name> to be defined by a following sequence of assembly language words. Words defined in this way are called code definitions. At compile-time, CODE saves the data stack pointer for later error checking by END-CODE.
END-CODE \ --
Terminates a code definition and checks the data stack pointer against the value stored when ;CODE or CODE was executed. The assembler is disabled. See: CODE and ;CODE.
LBL: \ --
A defining word that creates an assembler routine that can be called from other code routines as a subroutine. Use in the form:
LBL: <name>
...code...
END-CODE
When <name> executes it returns the address of the first byte of executable code. Later on another code definition can call <name> or jump to it.
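The following is a minimal sketch of the LBL: pattern, not taken from the VFX Forth sources. The names add-two and bump4 are hypothetical, and the sketch assumes that a plain RET with no operand assembles a near return matching the near CALL.

```forth
LBL: add-two            \ subroutine, not a Forth word: EBX := EBX + 2
  add  ebx, # 2         \ EBX holds the cached top of the data stack
  ret                   \ assumed: near return to the caller
END-CODE

CODE bump4              \ n -- n+4 ; hypothetical word built on the subroutine
  call add-two          \ ADD-TWO supplies its code address as the operand
  call add-two
  next,
END-CODE
```

If this assembles as assumed, 1 bump4 . should display 5. Because ESP is the Forth return stack pointer, the CALL and RET pair nests on the same stack that NEXT, returns through, so the subroutine must leave ESP balanced.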
The Forth virtual machine is held within the processor register set. Register usage is as follows:
EAX | scratch |
EBX | cached top of data stack |
ECX | scratch |
EDX | scratch |
ESI | Forth User area pointer |
EDI | Forth local variable pointer |
EBP | Forth data stack pointer - points to NOS |
ESP | Forth return stack pointer |
All unused registers may be freely used by assembler routines, but they may be altered by the operating system or wrapper calls. Before calling the operating system, all of the Forth registers should be preserved. Before using a register that the Forth system uses, it should be preserved and then restored on exit from the assembler routine. Be aware, in particular, that callbacks will generally modify the EAX register since this is used to hold the value returned from them.
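As an illustration of the register model, here is a sketch of a word with the stack effect of OVER. The name my-over is hypothetical. The sketch relies only on the table above: EBX caches the top of stack, EBP points to the second item (NOS), and EAX is scratch.

```forth
CODE my-over            \ x1 x2 -- x1 x2 x1 ; hypothetical name
  mov  eax, 0 [ebp]     \ EAX (scratch) := x1, the NOS cell in memory
  sub  ebp, # 4         \ the data stack grows downwards: make room
  mov  0 [ebp], ebx     \ spill the cached TOS (x2) to memory
  mov  ebx, eax         \ new TOS := x1
  next,
END-CODE
```

No reserved register is changed except EBP and EBX, which are updated exactly as the stack effect requires, so nothing needs to be saved and restored here.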
USE32
USE16
The first of these specifies that code from that point onwards is for a 32-bit segment. The second directive specifies that, from that point onwards, code generated is for a 16-bit segment. The default is USE32. These directives should be used outside a code definition, not within a definition.
It is possible to override the default segment size on an instruction-by-instruction basis. This is detailed later.
The assembler is designed to be very closely compatible with MASM and other assemblers. To this end the assembler assembles code written in the conventional prefix notation. However, because code may be converted from other MPE Forth systems, the postfix notation is also supported. The default mode is prefix. The directives to switch mode are as follows:
PREFIX
POSTFIX
These switch the assembler from then onwards into the new mode. The directives should be used outside a code definition, not within one. Their use within a code definition will lead to unpredictable results.
The assembler syntax follows very closely that of other 80386 assemblers. The major difference is that the VFX Forth assembler needs white space around everything. For example, where in MASM one might define:
MOV EAX,10[EBX]
we must write:
MOV EAX , 10 [EBX]
This distinction must be borne in mind when reading the following addressing mode information.
Many instructions have a register to register form. Both operands are registers. Such an instruction is of the form:
MOV EAX , EBX
This moves the contents of EBX into EAX. For compatibility with older MPE assemblers the first operand may be merged with the comma thus:
MOV EAX, EBX
This use of a register name with a 'built-in' comma also applies to other addressing modes.
If the assembler is set for direct-as-default (the MPE directive has been used), immediate numbers must be defined explicitly. This is done by the use of a hash (#) character:
MOV EAX , # 23
This example places the number 23 in EAX. The directives OFFSET and SEG are synonyms for #.
By default, the assembler is set to immediate-as-default (the INTEL directive has been used). In this case immediate numbers do not have to be specifically defined:
MOV EAX , 23
The above code also places the number 23 in EAX.
If the assembler is set for direct-as-default (the MPE directive has been used), direct addresses need not be defined explicitly:
MOV EAX , 23
This example places the contents of address 23 in EAX. If the assembler is set for immediate-as-default (the INTEL directive has been used), direct addresses have to be specifically defined, using the PTR or [] directives:
MOV EAX , PTR 23
MOV EAX , [] 23
Both the above code fragments also place the contents of address 23 in EAX.
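Writing both indicators explicitly gives code that should assemble the same way under either default. The sketch below is an illustration only: the variable hits and the word hits+5 are hypothetical names, and the operand forms are the ones shown above (a register loaded from [] address, and a register combined with # number).

```forth
VARIABLE hits           \ hypothetical variable, not part of VFX Forth

CODE hits+5             \ -- n ; n = contents of HITS plus 5
  sub  ebp, # 4         \ make room on the data stack
  mov  0 [ebp], ebx     \ spill the cached TOS
  mov  ebx, [] hits     \ direct: load the cell stored at HITS
  add  ebx, # 5         \ immediate: the literal number 5
  next,
END-CODE
```

With 10 stored in hits, hits+5 should return 15 whether MPE or INTEL was the last directive used.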
Intel define an addressing mode using a base and a displacement. In this mode, the effective address is calculated by adding the displacement to the contents of the base register. An example:
MOV EBX , # 0100
MOV EAX , 10 [EBX]
In this example, EAX is filled with the contents of address 0100+10, or address 110.
The assembler lays down different modes for displacements of 8-bit or 32-bit size, but this is internal to the assembler. The following registers may be used as base registers with a displacement:
[EAX] [ECX] [EDX] [EBX] [EBP] [ESI] [EDI]
If the displacement is zero then the assembler internally defines the mode as Base only. However, the displacement of zero must be supplied to the assembler:
MOV EBX , # 0100
MOV EAX , 0 [EBX]
This places in EAX the contents of address 100 (pointed to by EBX).
The following registers may be used as a base with no displacement:
[EAX] [ECX] [EDX] [EBX] [ESI] [EDI]
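A short sketch of base plus displacement addressing in complete code definitions follows. The names cell0@ and cell1@ are hypothetical. Each word takes an address in the cached top of stack (EBX) and replaces it with a cell fetched relative to that address.

```forth
CODE cell0@             \ addr -- x ; fetch the cell at addr
  mov  ebx, 0 [ebx]     \ base only: the zero displacement is still written
  next,
END-CODE

CODE cell1@             \ addr -- x ; fetch the cell at addr+4
  mov  ebx, 4 [ebx]     \ base EBX plus a displacement of 4
  next,
END-CODE
```

Given CREATE pair 11 , 22 , the phrase pair cell0@ should return 11 and pair cell1@ should return 22.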
The 80386 also allows two registers to be used to indirectly address memory. These are known as the base and the index. Such instructions are of the form:
MOV EAX , # 100
MOV EBX , # 200
MOV EDX , 10 [EAX] [EBX]
This will place in EDX the contents of address 100+200+10, or address 310. EAX is the base and EBX is the index. Again, the displacement may be 8-bits, 32-bits or have a value of zero. The assembler distinguishes between these three cases. The base and index registers may be any of the following:
[EAX] [EBX] [ECX] [EDX] [ESI] [EDI]
In addition, [EBP] may be used as the index register, and [ESP] may be used as the base register.
The 80386 further supports an addressing mode where the index register is automatically scaled by a fixed amount - either 2, 4 or 8. This is designed for indexing into two-dimensional arrays of elements of size greater than byte-size. One register may be used as the first index, another for the second index, and the word size becomes implicit in the instruction. The form of this addressing mode is very similar to that outlined above, with the exception that the index operand includes the number which is the scale:
MOV EBX , # 100
MOV ECX , # 2
MOV EAX , 10 [EBX] [ECX*4]
This stores into EAX, the contents of address 100+(4*2)+10, or address 118. The list of registers which may be used as base is the same as the above. The list of scaled indexes is as follows:
[EAX*2] [ECX*2] [EDX*2] [EBX*2] [EBP*2] [ESI*2] [EDI*2]
[EAX*4] [ECX*4] [EDX*4] [EBX*4] [EBP*4] [ESI*4] [EDI*4]
[EAX*8] [ECX*8] [EDX*8] [EBX*8] [EBP*8] [ESI*8] [EDI*8]
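Scaled indexing fits naturally with arrays of 32-bit cells. The following sketch is an assumption-laden illustration, with the hypothetical name nth-cell@. It takes an array address and a zero-based index, and uses EAX as the base and EBX scaled by 4 as the index.

```forth
CODE nth-cell@          \ addr n -- x ; fetch cell n (0-based) of the array at addr
  mov  eax, 0 [ebp]     \ EAX := addr, the second stack item
  add  ebp, # 4         \ remove that item from the data stack
  mov  ebx, 0 [eax] [ebx*4]   \ EBX := contents of addr + n*4
  next,
END-CODE
```

Given CREATE tbl 10 , 20 , 30 , the phrase tbl 2 nth-cell@ should return 30, since the effective address is tbl + 2*4 + 0.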
Some instructions may be prefixed with a segment override. These force data addresses to refer to a segment other than the data segment. The override must precede the instruction to which it relates:
MOV EBX , # 100
ES: MOV EAX , 10 [EBX]
This will set EAX to the value contained in address 110 in the extra segment. The list of segment overrides is:
CS: DS: ES: FS: GS: SS:
The default data size for a USE32 segment is 32-bit, but the default data size for a USE16 segment is 16-bit. These are the default data sizes the assembler will use. If the data is of a different size a data size override will have to be used. To define the size of the data the following size specifiers are used:
BYTE or B.
WORD or W.
DWORD or D.
QWORD
TBYTE
FLOAT
DOUBLE
EXTENDED
It is only necessary to specify size when ambiguity would otherwise arise. For example:
MOV 0 [EDX], # 10 \ can't tell
MOV 0 [EDX], EAX \ EAX specifies
The BYTE size defines that a byte operation is required:
MOVZX EAX , BYTE 10 [EBX]
The abbreviation B. may also be used in place of BYTE to define a byte operation. The WORD specifier defines that 16 bits are required:
MOV AX , WORD 10 [EBX]
The abbreviation W. may also be used to define a word operation. DWORD is the default for a USE32 segment, and indicates that 32-bit data is to be used:
MOV EAX , DWORD 10 [EBX]
FSTP DWORD 10 [EBX]
The abbreviation D. may also be used to specify a DWORD operation. The remaining size specifiers define data sizes for the floating point unit.
QWORD defines a 64-bit operation:
FSTP QWORD 10 [EBX]
TBYTE defines a 10-byte (80-bit) operation, such as:
FSTP TBYTE 10 [EBX]
FLOAT, DOUBLE and EXTENDED are synonyms for DWORD, QWORD and TBYTE respectively.
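As a sketch of a size specifier in a complete definition, the hypothetical words below fetch a zero-extended byte, once with BYTE and once with the B. abbreviation. They are expected to behave like C@, but are shown only to illustrate the syntax.

```forth
CODE byte@              \ addr -- u ; zero-extended byte at addr
  movzx ebx, byte 0 [ebx]   \ BYTE gives the size of the memory operand
  next,
END-CODE

CODE byte@'             \ addr -- u ; same operation using the abbreviation
  movzx ebx, b. 0 [ebx]
  next,
END-CODE
```

Because the x86 stores the least significant byte first, fetching a byte from a cell holding $1234 should return $34.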
The segment type defines the default data size and address size for the code in the segment. If needed, it is possible to force the data size or the address size laid down to be the other. There is a set of data and address size overrides which work for one instruction only. These are:
D16:
D32:
A16:
A32:
and they would be used as follows:
D16: MOV EAX , # 23
A16: MOV EAX , 10 [EBX]
The first of these, in a USE32 segment, would lay down 16-bit data to be loaded into AX. The second would lay down a 16-bit offset from [EBX] for the effective address in the instruction. The situation would be reversed in a USE16 segment - the A32: and D32: directives would cause 32-bit data or addresses to be laid.
Jumps and calls may be either intra-segment or inter-segment. The former is a near branch or call whilst the latter is a far branch or call. Within a segment, a branch may additionally be short (an 8-bit displacement) or long (a full-size displacement). The assembler is able to lay down each of these forms. The default for a JMP or a CALL is near, whilst the default for a conditional branch is short. RET follows the same pattern as CALL. The directives supporting short/long and near/far are:
SHORT LONG NEAR FAR
These would be used as follows:
2 CONSTANT THAT \ the segment number
LBL: THIS \ the address
CALL THIS
CALL NEAR THIS
CALL FAR THAT THIS
JMP THIS
JMP NEAR THIS
JMP FAR THAT THIS
JCC THIS
JCC SHORT THIS
JCC LONG THIS
RET THIS
RET NEAR THIS
RET FAR THAT THIS
For compatibility with older MPE assemblers the mnemonics CALL/F, RET/F and JMP/F are also provided.
The assembler in VFX Forth follows both the syntax and the mnemonics defined in the Intel Programmers Reference books, for both the 80386 and the 80387. However, there are certain exceptions. These are listed below.
The zero operand forms of certain stack register instructions for the 80387 have been omitted. Their functionality is supported however. Such instructions are listed below, with a form of the syntax which will support the function:
FADD FADDP ST(1) , ST
FCOM FCOM ST(1)
FCOMP FCOMP ST(1)
FDIV FDIVP ST(1) , ST
FDIVR FDIVRP ST(1) , ST
FMUL FMULP ST(1) , ST
FSUB FSUBP ST(1) , ST
FSUBR FSUBRP ST(1) , ST
Certain 80386 instructions have either one operand or two operands, of which only one is variable. These instructions are:
MUL DIV IDIV NEG NOT
These instructions take only one operand in the VFX Forth assembler.
If you need to use labels within a code definition, you may use the local labels provided. These are used just like labels in a normal assembler, but some restrictions are applied.
Ten labels are pre-defined, and their names are fixed. Additional labels can be defined up to a maximum of 32. There is a limit of 128 forward references.
A reference to a label is valid until the next occurrence of LBL:, CODE or ;CODE, whereupon all the labels are reset.
A reference to a label in a definition must be satisfied in that definition. You cannot define a label in one code definition and refer to it from another.
The local labels have the names L$1 L$2 ... L$10 and these names should be used when referring to them, e.g.
JNE L$5
A local label is defined by words of the same names, but with a colon as a suffix:
L$1: L$2: ... L$10:
Additional labels (up to a maximum of 32 altogether) may be referred to by:
n L$
where n is in the range 11..32 (decimal), and they may be defined by:
n L$:
where n is again in the range 11..32 (decimal).
This assembler is designed to cope with CPUs from 80386 upwards. Some instructions are only available on later CPUs. Note that CPU selection affects the assembler and the VFX code generator, not the run time of your application. If you select a higher CPU level than that of the CPU the application actually runs on, incorrect operation will occur.
CPU=386 \ -- ; select base instruction set
CPU=PPro \ -- ; Pentium Pro and above with CMOVcc
CPU=P4 \ -- ; Pentium 4 and above
PPro? \ -- flag ; true if at least Pentium Pro
P4? \ -- flag ; true if at least Pentium 4
The VFX code generator also uses this information to enable various code generation techniques. For VFX Forth for DOS, the default selection is for 386 class CPUs, for all others it is for the Pentium 4 instruction set.
Structures like the Forth control structures have been added to the assembler. They allow forward branches without the need for labels and impose the strictures of structured programming at the assembler level. Devotees of spaghetti programming are free to go their own way as the copious supply of branch instructions is still available, and the local label facility may be used with all branch instructions.
The required status flag indicator must precede each conditional structure. The structure assembled will have a branch opcode that is the logical inverse of the one specified. Thus for EQ, IF, a JNE will be assembled so that the code after IF, is executed if the EQ status occurs. The assembler structure words end in a comma, e.g. IF, to differentiate them from the regular Forth structures, and to indicate that code is being generated.
The structures are described below, and the symbol cc (condition code) may be any one of the following:
Z, | equal to 0 |
NZ, | not equal to 0 |
S, | less than 0 |
NS, | greater than or equal to 0 |
L, | less than |
GE, | greater than or equal |
LE, | less than or equal |
G, | greater than |
B, | unsigned less than - address compares |
AE, | unsigned greater than or equal |
BE, | unsigned less than or equal |
A, | unsigned greater than |
O, | overflow |
NO, | no overflow |
PE, | parity even |
PO, | parity odd |
CY, | carry flag set |
NC, | carry flag not set |
NCXZ, | ECX/CX register non-zero |
The structure words build sets of assembler branches to perform functions equivalent to their high-level namesakes, but the names end with a comma to distinguish them from the high-level Forth words. Be sure you understand the high level structures before using the assembler equivalents. The structures are:
cc IF, ... THEN,
cc IF, ... ELSE, ... THEN,
BEGIN, ... AGAIN,
BEGIN, ... cc UNTIL,
BEGIN, ... cc WHILE, ... REPEAT,
An additional structure allows a section of code to be performed 'n' times. All it actually does is to load ECX with 'n' and mark the start of a backward branch so that the mark may be used later. The structure is:
n TIMES, ... LOOP,
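To show one of the structures in a complete definition, here is a sketch that sums the integers from n down to 1 with BEGIN, ... cc UNTIL,. The name sum-to is hypothetical, and the word assumes n is at least 1. ECX is a scratch register, so it may be used freely as the counter.

```forth
CODE sum-to             \ n -- sum ; sum = n + (n-1) + ... + 1, requires n >= 1
  mov  ecx, ebx         \ ECX := loop counter
  sub  ebx, ebx         \ EBX := 0, the running total
  begin,
    add  ebx, ecx       \ total := total + counter
    dec  ecx            \ sets the zero flag when the counter reaches 0
  z, until,             \ loop back until the zero flag is set
  next,
END-CODE
```

With this definition 4 sum-to should return 10. As described above, the branch actually assembled for Z, UNTIL, is the logical inverse of the condition named, so the loop repeats while ECX is non-zero.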
: mpe \ -- ; force def # addressing
Select the MPE default addressing mode, in which
the default addressing mode is direct addressing.
This is provided for compatibility with legacy MPE
systems.
The indicators [] and # can be used for code which must be compiled in either condition.
: intel \ -- ; force def.direct addressing
Select the INTEL default addressing mode, in which
the default addressing mode is immediate addressing.
The indicators [] and # can be used for code which must be compiled in either condition.
: prefix \ -- ; select prefix mode
Set the assembler to use prefix notation with the opcode first.
This is the default condition.
: postfix \ -- ; select postfix mode
Set the assembler to use postfix notation with the opcode
after the operands.
Because of the performance of the VFX optimiser, use of assembler is only necessary when defining new compilation structures. Otherwise the use of assembler code should be avoided.
Assembler macros are defined as follows:
MASM: <name>
<assembler code goes here>
;MASM
For example, the following macro pops the top of the NDP stack to the external floating point stack.
MASM: popFPU \ -- ; pops FTOS to float stack
mov eax, FSP-OFFSET [esi] \ get FP stack pointer
lea eax, -FPCELL [eax] \ update stack pointer
fstp fword 0 [eax] \ store and pop
mov FSP-OFFSET [esi], eax \ restore FP stack pointer
;MASM
In line assembler code may be compiled into the middle of a colon definition by using the phrase:
: <name> \ just another Forth word
...
[ASM <insert assembler here> ASM]
...
;
A fragment of assembler for compilation when the containing word is executed can be defined by using the following:
: a-compiler \ will compile some assembler
...
a[ <fragment to be compiled> ]a
...
;
The following example compiles an in-line floating point literal.
: o_flit, \ F: f -- ; F: -- f ; compile floating point literal
a[ popFPU \ references a previous macro
jmp here 2+ FPCELL + \ skip inline literal
]a
f,
a[ fld fword ptr here FPCELL - ]a
;
When using in-line code generators such as [ASM ... ASM] you should flush the code generator contents with [O/F].
[O/F] [ASM ... ASM]
After [ASM the top of the data stack will be in EBX with all other stack items pointed to by EBP. The code generator expects this same state to exist after ASM].
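A minimal sketch of in-line assembler inside a colon definition follows. The name bump-tos is hypothetical, and the sketch assumes only what is stated above: after [ASM the top of stack is in EBX, and the same arrangement must hold at ASM].

```forth
: bump-tos              \ n -- n+1 ; hypothetical word
  [O/F]                 \ flush the code generator before the in-line code
  [ASM  inc ebx  ASM]   \ EBX is the top of stack on entry and on exit
;
```

The single INC leaves EBP untouched and the result in EBX, so the state the code generator expects after ASM] is preserved. If that holds, 6 bump-tos should return 7.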
: dxb \ b -- ; lay byte
Lay a byte into the instruction stream. Use in the form:
dxb $55
: dxw \ w -- ; lay 16 bits
Lay a 16-bit word into the instruction stream. Use in the form:
dxw $55AA
: dxl \ l -- ; lay 32 bit long
Lay a 32-bit dword into the instruction stream. Use in the form:
dxl $11223344
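These words are mainly useful for laying opcodes or data that the assembler does not provide directly. As a sketch only, the hypothetical word below lays two raw $90 bytes, which is the one-byte x86 NOP instruction, ahead of the normal return.

```forth
CODE two-nops           \ -- ; hypothetical word that does nothing
  dxb $90               \ raw byte $90 is the x86 NOP opcode
  dxb $90
  next,                 \ return to Forth as usual
END-CODE
```

Since NOP changes no registers, the data stack should be exactly the same before and after two-nops executes.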
: $ \ -- chere
Return the PC value of the start of the instruction.
#-701 Invalid addressing mode
#-702 N not in range -128..+127
#-703 Label reference number out of range
#-704 Label definition number out of range
#-705 Invalid instruction for selected CPU type