Last updated: 25 May 2012
Change Notes
25 May 2012: Legacy notation support.
9 May 2011: Comment handling, better handling of pointers ('*').
Accessing DLLs and shared libraries
Introduction
The EXTERN library supports calls to external API functions in dynamic link libraries (DLLs) for Windows and in shared libraries for Linux and other Unix-derived operating systems such as OS X. API libraries export functions in a variety of ways that are mostly transparent to programmers in languages such as C, Pascal and Fortran.
The notation is currently supported by VFX Forth (Win/Mac/Lin) and iForth (Win/Mac/Lin). The following versions are provided:
- VFX Forth for Windows or Linux: externVFX.fth. Documentation is in externVFX.html. Floating point data is supported for use with Lib\Ndp387.fth.
- iForth (all versions): port by Hanno Schwalm. The file dynlibs.frt is version 1.22 and is far from perfect, but it works well in some projects on all iForth ports. Some ideas in the implementation are really different: it relies on other iForth-specific code deep inside the iForth kernel to make library calls as fast and stable as possible. The shipping version is 2.00.
Feel free to port this code to other systems. When you have a working port, please contribute the code to us so that everyone can share it. A port will be heavily dependent both on the host Forth and the operating system.
Contact
Usage
Before a library function can be used, the library itself must be declared, e.g.
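A minimal sketch of such a declaration, using the LIBRARY: form described later in this document. The library names are illustrative assumptions and vary between systems:

```forth
\ Windows host: declare the DLL that provides the wanted functions.
Library: user32.dll

\ Linux host: the shared-object name below is an assumption and
\ may differ between distributions.
\ Library: libc.so.6
```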
Access to functions in a library is provided by the EXTERN: syntax which is similar to a C style function prototype, e.g.
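A sketch of what such a prototype can look like, using parameter types from the Windows Types section below and assuming user32.dll has been declared with LIBRARY: as above:

```forth
\ C-style prototype; the formal parameter names are ignored by the parser.
Extern: int PASCAL SendMessage( HWND hwnd, DWORD mesg, WPARAM wparam, LPARAM lparam );
```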
This can be used to prototype the function SendMessage from the Microsoft Windows API, and produces a Forth word SendMessage.
SendMessage \ hwnd mesg wparam lparam -- int
For Linux and other Unices, the same notation is used. The default calling convention is nearly always applicable. The following example shows that definitions can occupy more than one line. It also indicates that some token separation may be necessary for pointers:
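A sketch of a multi-line declaration for execve. It assumes the C library is available under the name libc.so.6, which depends on the system. Note the white space around each * token:

```forth
Library: libc.so.6        \ assumed library name

\ The declaration runs until the closing ')'.
Extern: int execve(
  const char * path,
  char * const argv[],
  char * const envp[]
);
```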
This produces a Forth word execve.
execve \ path argv envp -- int
The parser used to separate the tokens is not ideal. If you have problems with a definition, make sure that * tokens are white-space separated. Formal parameter names, e.g. argv above, are ignored. Array indicators, [] above, are also ignored when part of the names.
The input types may be followed by a dummy name which is discarded. Everything on the source line after the closing ')' is discarded.
The use of PASCAL is not required as PASCAL is the default calling convention in the Windows versions. The default for Linux versions is "C". The default is always used unless overridden in the declaration.
Format
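Reading the rules in this document together, a declaration has roughly the shape sketched below. This is an informal summary written as Forth comments, not a formal grammar, and the getpid declaration is only an illustration that assumes a Linux host with the C library available as libc.so.6:

```forth
\ Informal shape of a declaration (not a formal grammar):
\
\   EXTERN: <return-type> [ <callconv> ] <name> ( <arglist> );
\
\   <callconv>  optional: "C", "PASCAL", PASCAL, WINAPI or STDCALL
\   <arglist>   void, or a comma-separated list of types, each
\               optionally followed by a dummy name

Library: libc.so.6                  \ assumed library name
Extern: int "C" getpid( void );     \ empty parameter list
```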
Note that during searches <name> is passed to the operating system exactly as it is written, i.e. case sensitive. The Forth name is case-insensitive.
Because a standard Forth only guarantees dictionary names of up to 31 characters for portable source code, very long API names can cause problems. The word ALIASEDEXTERN: therefore allows separate specification of API and Forth names (see below). ALIASEDEXTERN: also solves problems when API functions differ only in case or when their names conflict with existing Forth word names.
Calling Conventions
In this discussion, caller refers to the Forth system (below the application layer) and callee refers to a function in a DLL or shared library. The EXTERN: mechanism supports three calling conventions.
- C language: "C"
  Caller tidies the stack frame. The arguments (parameters) which are passed to the library are reordered. This convention can be specified by using "C" after the return type specifier and before the function name.
- Pascal language: "PASCAL"
  Callee removes arguments from the stack frame. This is invisible to the programmer at the application layer. The arguments (parameters) which are passed to the library are not reordered. This convention is specified by "PASCAL" after the return type specifier and before the function name.
- Windows API: WINAPI | PASCAL | STDCALL
  In nearly all cases, but not all, calls to Windows API functions require C style argument reversal and the called function cleans up. Specify this convention with PASCAL, WinAPI or StdCall after the return type specifier and before the function name.
Unless otherwise specified, the Forth system's default convention is used. Under Windows this is WINAPI and under Linux and other Unices it is "C".
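The sketch below shows the default and an explicit override side by side on a Windows host. It assumes kernel32.dll and the C runtime msvcrt.dll are present; the choice of functions is illustrative only:

```forth
Library: kernel32.dll
Extern: DWORD GetTickCount( void );                 \ default convention
Extern: void WinApi Sleep( DWORD dwMilliseconds );  \ same convention, stated explicitly

\ C runtime functions use the C convention even under Windows,
\ so the default has to be overridden here.
Library: msvcrt.dll
Extern: int "C" atoi( const char * str );
```

Stating the convention in every declaration, as recommended later in this document, keeps the source independent of the host default.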
Promotion and Demotion
The system generates code to either promote or demote non-CELL sized arguments and return results which can be either signed or unsigned. Although Forth is an un-typed language it must deal with libraries which do have typed calling conventions. In general the use of non-CELL arguments should be avoided but return results should be declared in Forth with the same size as the C or PASCAL convention documented.
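As an illustration of a non-CELL return result, the Windows function GetKeyState returns a 16 bit SHORT whose top bit is set while the key is down. The sketch assumes a Windows host and that the declared SHORT result is sign-extended to a cell as described above; ShiftDown? is a hypothetical helper name:

```forth
Library: user32.dll
Extern: SHORT PASCAL GetKeyState( int nVirtKey );

\ $10 is VK_SHIFT. After sign extension a set top bit makes the
\ cell-sized result negative.
: ShiftDown?  \ -- flag
  $10 GetKeyState 0< ;
```

Declaring the result as int instead would leave the upper bits of the cell undefined for this test, which is why the declared size should match the documented one.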
Argument Reversal
The default calling convention for the host operating system is used. The right-most argument/parameter in the C-style prototype is on the top of the Forth data stack. When calling an external function the parameters are reordered if required by the operating system; this is to enable the argument list to read left to right in Forth source as well as in the C-style operating system documentation.
Under certain conditions, the order can be reversed. See the words "C" and "PASCAL" which define the order for the operating system. See L>R and R>L which define the Forth stack order with respect to the arguments in the prototype.
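A small sketch of the default ordering, assuming a Linux host with the C library available as libc.so.6. The Forth phrase supplies fd, buf and count in the same left-to-right order as the prototype, so count ends up on top of the stack; hello is a hypothetical word name:

```forth
Library: libc.so.6        \ assumed library name
Extern: int "C" write( int fd, void * buf, size_t count );

: hello  \ --
  1 s" Hello" write drop ;   \ fd=1  buf=c-addr  count=u
```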
C comments in declarations
Very rudimentary support for C comments in declarations is provided, but is good enough for the vast majority of declarations.
- Comments can be // ... or /* ... */,
- Comments must be at the end of the line,
- Comments are treated as extending to the end of the line,
- Comments must not contain the ')' character.
The example below is taken from a SQLite interface.
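A sketch of declarations in this style, written to obey the comment rules listed above. The library name is an assumption, and the second parameter of sqlite3_open, a pointer to a pointer in the C header, is declared here simply as a generic pointer:

```forth
Library: sqlite3.dll      \ assumed name; a Linux host might use libsqlite3.so.0

Extern: int "C" sqlite3_open(
  const char * filename,   /* database filename, UTF-8 */
  void * ppDb              /* OUT: receives the database handle */
);

Extern: int "C" sqlite3_close( void * db );   // one-line form
```

Neither comment contains a ')' character, since the parser would treat it as the end of the parameter list.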
Controlling external references
1 value ExternWarnings? \ -- n
Set this true to get warning messages when an external reference
is redefined.
0 value ExternRedefs? \ -- n
If non-zero, redefinitions of existing imports are permitted.
Zero is the default for VFX Forth so that redefinitions of
existing imports are ignored.
1 value InExternals? \ -- n
Set this true if following import definitions are to be in
the EXTERNALS vocabulary, false if they are to go into
the wordlist specified in CURRENT. Non-Zero is the
default for VFX Forth.
: InExternals \ --
External imports are created in the EXTERNALS vocabulary.
: InCurrent \ --
External imports are created in the wordlist specified by
CURRENT.
Library Imports
In VFX Forth, libraries are held in the EXTERNALS vocabulary, which is part of the minimum search order. Other Forth systems may use the CURRENT wordlist.
For turnkey applications, initialisation, release and reload of required libraries is handled at start up.
variable lib-link
Anchors the chain of dynamic/shared libraries.
: init-libs \ --
Release and reload the required libraries.
: find-libfunction \ z-addr -- address|0
Given a zero terminated name, attempt to find the named function
somewhere within the already active libraries.
: .Libs \ --
Display the list of declared libraries.
: Library: \ "<name>" -- ; -- loadaddr|0
Register a new library by name.
Use in the form:
LIBRARY: <name>
Executing <name> later will return its load address. This is useful when checking for libraries that may not be present. After definition, the library is the first one searched by import declarations.
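A minimal sketch of such a presence check. The library name and the word sqlite-present? are hypothetical:

```forth
Library: libsqlite3.so.0      \ assumed library name

\ True if the library was found and loaded, false if its
\ load address is reported as zero.
: sqlite-present?  \ -- flag
  libsqlite3.so.0 0<> ;
```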
: topLib \ libfn --
Make the library structure the top/first in the library
search order.
: firstLib \ "<name>" --
Make the library top of the library search order. Use during
interpretation in the form:
FirstLib <name>
to make the library top of the search order. This is useful when you know that there may be several functions of the same name in different libraries.
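For example, a sketch assuming a Windows host on which several libraries have already been declared:

```forth
Library: kernel32.dll
Library: ws2_32.dll
Library: user32.dll       \ declared last, so searched first at this point

FirstLib ws2_32.dll       \ ws2_32.dll is now searched first
Extern: int PASCAL closesocket( SOCKET s );
```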
: [firstLib] \ "<name>" --
Make the library top of the library search order. Use during
compilation in the form:
[firstLib] <name>
to make the library top of the search order. This is useful when you know that there may be several functions of the same name in different libraries.
Function Imports
Function declarations in shared libraries are compiled into the EXTERNALS vocabulary. They form a single linked list. When a new function is declared, the list of previously declared libraries is scanned to find the function. If the function has already been declared, the new definition is ignored if ExternRedefs? is set to zero. Otherwise, the new definition overrides the old one as is usual in Forth.
In VFX Forth, ExternRedefs? is set by default to zero.
variable import-func-link \ -- addr
Anchors the chain of imported functions in shared libraries.
: ExternLinked \ c-addr u -- address|0
Given a string, attempt to find the named function in the
already active libraries. Returns zero when the function is
not found.
: init-imports \ --
Initialise Import libraries. INIT-IMPORTS is called by
the system cold chain.
: Extern: \ "text" --
Declare an external API reference. See the syntax above.
The Forth word has the same name as the function in the
library, but the Forth word name is not case-sensitive.
The length of the function's name may not be longer than a
Forth word name.
: AliasedExtern: \ "forthname" "text" --
Like EXTERN: but the declared external API reference
is called by the explicitly specified forthname.
The Forth word name follows and then the API name.
Used to avoid name conflicts, e.g.
AliasedExtern: saccept int accept( HANDLE, void *, unsigned int *);
which references the Winsock accept function but gives it the Forth name SACCEPT. Note that here we use the fact that formal parameter names are optional.
: LocalExtern: \ "forthname" "text" --
As AliasedExtern:, but the import is always built into
the CURRENT wordlist.
: extern \ "text" --
An alias for EXTERN:.
: .Externs \ -- ; display EXTERNs
Display a list of the external API calls.
: .BadExterns \ --
Display a list of any external API calls that have not been
resolved.
: func-loaded? \ xt -- addr|0
Given the XT of a word defined by EXTERN: or friends,
returns the address of the DLL function in the DLL,
or 0 if the function has not been loaded/imported yet.
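The diagnostic words above can be combined as in the following sketch, which assumes a Windows host; .resolved? is a hypothetical helper:

```forth
Library: user32.dll
Extern: int PASCAL SendMessage( HWND hwnd, DWORD mesg, WPARAM wparam, LPARAM lparam );

\ Report whether the import behind an xt has been resolved.
: .resolved?  \ xt --
  func-loaded? if  ." resolved"  else  ." not yet loaded"  then  cr ;

' SendMessage .resolved?
.BadExterns        \ list any imports that remain unresolved
```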
Pre-Defined parameter types
The types known by the system are all found in the vocabulary TYPES. You can add new ones at will. Each TYPE definition modifies one or more of the following VALUEs.
argSIZE | Size in bytes of data type.
argDEFSIGN | Default sign of data type if no override is supplied.
argREQSIGN | Sign override. This and the previous use 0 = unsigned and 1 = signed.
argISPOINTER | 1 if type is a pointer, 0 otherwise.
Each TYPES definition can either set these flags directly or can be made up of existing types.
Note that you should explicitly specify a calling convention for every function defined.
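New types built from existing ones might be added as in the sketch below. It assumes that TYPES behaves as an ordinary vocabulary under the standard search-order words, and that the compilation wordlist before the switch was the one on top of the search order; HICON and HCURSOR are real Windows type names that are not in the predefined list:

```forth
also types definitions      \ compile new words into TYPES

: HICON    handle ;         \ an icon handle, built from HANDLE
: HCURSOR  handle ;         \ a cursor handle, built from HANDLE

previous definitions        \ restore the earlier search order
```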
Calling conventions
: "C" \ --
Set Calling convention to "C" standard. Arguments are
reversed, and the caller cleans up the stack.
: "PASCAL" \ --
Set the calling convention to the "PASCAL" standard as used
by Pascal compilers. Arguments are not reversed, and the
called routine cleans up the stack. Note that this is not
the same as the word PASCAL below.
: PASCAL \ --
Set the calling convention to the Windows PASCAL standard.
Arguments are reversed in C style, but the called routine
cleans up the stack. This is the standard Win32 API calling
convention. N.B. There are exceptions!
This convention is also called "stdcall" and "winapi" by
Microsoft, and is commonly used by Fortran programs.
: WinApi \ --
A synonym for PASCAL.
: StdCall \ --
A synonym for PASCAL.
: R>L \ --
By default, arguments are assumed to be on the Forth stack
with the top item matching the rightmost argument in the
declaration so that the Forth parameter order matches that
in the C-style declaration.
R>L reverses this.
: L>R \ --
By default, arguments are assumed to be on the Forth stack
with the top item matching the rightmost argument in the
declaration so that the Forth parameter order matches that
in the C-style declaration.
L>R confirms this.
Basic Types
: unsigned \ --
Request current parameter as being unsigned.
: signed \ --
Request current parameter as being signed.
: int \ --
Declare parameter as integer. This is a signed 32 bit quantity
unless preceded by unsigned.
: char \ --
Declare parameter as character. This is a signed 8 bit quantity
unless preceded by unsigned.
: void \ --
Declare parameter as void. A VOID parameter has no
size. It is used to declare an empty parameter list, a null
return type or is combined with * to indicate a generic
pointer.
: * \ --
Mark current parameter as a pointer.
: const ; \ --
Marks next item as constant in C terminology. Ignored.
: int32 \ --
A 32bit signed quantity.
: int16 \ --
A 16 bit signed quantity.
: int8 \ --
An 8 bit signed quantity.
: uint32 \ --
32bit unsigned quantity.
: uint16 \ --
16bit unsigned quantity.
: uint8 \ --
8bit unsigned quantity.
: LongLong \ --
A 64 bit signed or unsigned integer. At run-time, the argument
is taken from the Forth data stack as a normal Forth double
with the top item on the top of the data stack.
: LONG int ;
A 32 bit signed quantity.
: SHORT \ --
For most compilers a short is a 16 bit signed item,
unless preceded by unsigned.
: BYTE \ --
An 8 bit unsigned quantity.
: float \ --
32 bit float.
: double \ --
64 bit float.
Windows Types
The following parameter types are non "C" standard and are used by Windows in function declarations. They are all defined in terms of existing types.
: OSCALL PASCAL ;
Used for portable code to avoid three sets of declarations.
For Windows, this is a synonym for PASCAL and under
Linux and other Unices this is a synonym for "C".
: DWORD unsigned int ;
32 bit unsigned quantity.
: WORD unsigned int 2 to argSIZE ;
16 bit unsigned quantity.
: HANDLE void * ;
HANDLEs under Windows are effectively pointers.
: HMENU handle ;
A Menu HANDLE.
: HDWP handle ;
A DEFERWINDOWPOS structure Handle.
: HWND handle ;
A Window Handle.
: HDC handle ;
A Device Context Handle.
: HPEN handle ;
A Pen Handle.
: HINSTANCE handle ;
An Instance Handle.
: HBITMAP handle ;
A Bitmap Handle.
: HACCEL handle ;
An Accelerator Table Handle.
: HBRUSH handle ;
A Brush Handle.
: HMODULE handle ;
A module handle.
: HENHMETAFILE handle ;
A Meta File Handle.
: HFONT handle ;
A Font Handle.
: HRESULT DWORD ;
A 32bit Error/Warning code as returned by various COM/OLE calls.
: LPPOINT void * ;
Pointer to a POINT structure.
: LPACCEL void * ;
Pointer to an ACCEL structure.
: LPPAINTSTRUCT void * ;
Pointer to a PAINTSTRUCT structure.
: LPSTR void * ;
Pointer to a zero terminated string buffer which may be modified.
: LPCTSTR void * ;
Pointer to a zero terminated string constant.
: LPCSTR void * ;
Another string pointer.
: LPTSTR void * ;
Another string pointer.
: LPDWORD void * ;
Pointer to a 32 bit DWORD.
: LPRECT void * ;
Pointer to a RECT structure.
: LPWNDPROC void * ;
Pointer to a WindowProc function.
: ATOM word ;
An identifier used to represent an atomic string in the OS table.
See RegisterClass() in the Windows API for details.
: WPARAM dword ;
A parameter type which used to be 16 bit but under Win32 is an
alias for DWORD.
: LPARAM dword ;
Used to mean LONG-PARAMETER (i.e. 32 bits, not 16 as under Win311)
and is now effectively a DWORD.
: UINT dword ;
Windows type for unsigned INT.
: BOOL int ;
Windows Boolean type. 0 is false and non-zero is true.
: LRESULT int ;
Long-Result, under Win32 this is basically an integer.
: colorref DWORD ;
A packed encoding of a color made up of 8 bits RED, 8 bits GREEN,
8 bits BLUE and 8 bits ALPHA.
: SOCKET dword ;
Winsock socket reference.
Linux Types
: OSCALL "C" ;
Used for portable code to avoid three sets of declarations.
For Windows, this is a synonym for PASCAL and under
Linux this is a synonym for "C".
: FILE uint32 ;
Always use as FILE * stream.
: DIR uint32 ;
Always use as DIR * stream.
: size_t uint32 ;
Linux type for unsigned INT.
: off_t uint32 ;
Linux type for unsigned INT.
: int32_t int32 ;
Synonym for int32.
: int16_t int16 ;
Synonym for int16.
: int8_t int8 ;
Synonym for int8.
: uint32_t uint32 ;
Synonym for uint32.
: uint16_t uint16 ;
Synonym for uint16.
: uint8_t uint8 ;
Synonym for uint8.
: time_t uint32 ;
Number of seconds since midnight UTC of January 1, 1970.
: clock_t uint32 ;
Processor time in terms of CLOCKS_PER_SEC.
: pid_t int32 ;
Process ID.
: uid_t uint32 ;
User ID.
: mode_t uint32 ;
File mode.
Mac OS X Types
: OSCALL "C" ;
Used for portable code to avoid three sets of declarations.
For Windows, this is a synonym for PASCAL and under
OS X this is a synonym for "C".
: FILE uint32 ;
Always use as FILE * stream.
: DIR uint32 ;
Always use as DIR * stream.
: size_t uint32 ;
Unix type for unsigned INT.
: off_t uint32 ;
Unix type for unsigned INT.
: int32_t int32 ;
Synonym for int32.
: int16_t int16 ;
Synonym for int16.
: int8_t int8 ;
Synonym for int8.
: uint32_t uint32 ;
Synonym for uint32.
: uint16_t uint16 ;
Synonym for uint16.
: uint8_t uint8 ;
Synonym for uint8.
: time_t uint32 ;
Number of seconds since midnight UTC of January 1, 1970.
: clock_t uint32 ;
Processor time in terms of CLOCKS_PER_SEC.
: pid_t int32 ;
Process ID.
: uid_t uint32 ;
User ID.
: mode_t uint32 ;
File mode.