Floating Point

Introduction

The floating point packages use the FPU instruction set. The source code can be found in the files Lib\x86\Ndp387.fth and Lib\x86\Hfp387.fth. Ndp387.fth is the primary package, and produces the fastest and smallest code. Lib\x86\HfpGL32.fth has been retired. VFX Forth v5.1+ with recognisers is required.

Ndp387.fth - coprocessor stack

In Ndp387.fth floating point numbers are kept in the floating point unit's internal stack only. This code is significantly faster than when using an external stack, but is limited to the use of 8 floats on the NDP stack, including any working temporary numbers.

By default, Ndp387.fth defines floating point stack items and literals to be in 80 bit extended real format. If you need to save a small amount of space, the default can be changed by setting the constant FPCELL to 8 for 64 bit double precision or to 4 for 32 bit single precision. If you do this, accuracy and resolution will be reduced.

From VFX Forth 3.70 onwards, Ndp387.fth includes an optimiser. According to the results of Examples\Benchmrk\mm.fth this nearly doubles the overall performance of the matrix multiply floating point benchmark code. Well tuned algorithms may see speed improvements of over three times. Use of the Pentium optimisations with +IDATA is recommended for performance critical floating point code.

The file Ndp387.v36.fth contains the last release of the unoptimised code.

Hfp387.fth - external FP stack

In Hfp387.fth floating point numbers are kept on a separate stack, pointed to by the USER variables FSP and FS0. The top of the FP stack is cached in the FPU.

All tasks, winprocs and callbacks are allocated a separate 4096 byte floating point stack. If you need a larger one, allocate it from the heap using ALLOCATE, and modify FSP and FS0 accordingly. Note that the stack grows down.
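A minimal sketch of the procedure just described, for Hfp387.fth only. The names /BIGFS and BIG-FSTACK are hypothetical. The sketch assumes that both FS0 and FSP should point at the high end of the new region, because the stack grows down, and that the floating point stack is empty when the switch is made. Check Hfp387.fth before relying on it.

```forth
#16384 constant /bigfs          \ hypothetical size in bytes

: big-fstack    \ -- ; give the current task a larger FP stack
  /bigfs allocate throw         \ -- addr ; base of the new region
  /bigfs +                      \ -- top ; the stack grows down
  dup fs0 !  fsp !              \ assumption: both point at the top when empty
;
```

The old 4096 byte stack is simply abandoned here. A task that terminates would also need to FREE the allocated region, which means keeping the base address somewhere.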

By default, Hfp387.fth defines floating point stack items and literals to be in 80 bit extended real format. If you need to save a small amount of space, the default can be changed by setting the constant FPCELL to 8 for 64 bit double precision or to 4 for 32 bit single precision. If you do this, accuracy and resolution will be reduced.

If you are uncertain of the state of the floating point unit, you can initialise it using FINIT ( -- ). After FINIT, the internal floating point stack is empty.

Radians and Degrees

Please note that the trig functions work in radians. For calculations in degrees, use DEG>RAD beforehand and RAD>DEG afterwards.
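As an illustration, the conversions can be wrapped into degree-based helpers. The names FSIN-DEG and FATAN-DEG are hypothetical and are not part of the floating point packs.

```forth
: fsin-deg      \ F: fdeg -- sin(fdeg)
  deg>rad fsin ;

: fatan-deg     \ F: f -- fdeg ; arctangent returned in degrees
  fatan rad>deg ;

30e0 fsin-deg  f.       \ sine of 30 degrees
1e0  fatan-deg f.       \ arctangent of 1, in degrees
```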

Number formats, ANS and Forth200x

The ANS Forth standard specifies that floating point numbers must be entered in the form 1.234e5 and must contain a point '.' and 'e' or 'E', and that double integers are terminated by a point '.'.

This situation prevents the use of the standard conversion words in international applications because of the interchangeable use of the '.' and ',' characters in numbers. Because of this, VFX Forth uses two four-byte arrays, FP-CHAR and DP-CHAR, to hold the characters used as the floating point and double integer indicator characters. By default, FP-CHAR is initialised to '.' and DP-CHAR is initialised to ',' and '.'. For strict ANS and Forth-2012 compliance, you should set them as follows:


\ ANS standard setting
  char . dp-char !
  char . fp-char !
: ans-floats    \ -- ; for strict ANS compliance
  [char] . dp-char !
  [char] . fp-char !
;
\ MPE defaults
  char , dp-char !
  char . dp-char 1+ c!
  char . fp-char !
: mpe-floats    \ -- ; for existing and most legacy code
  [char] , dp-char !
  [char] . dp-char 1+ c!
  [char] . fp-char !
;
\ Legacy defaults, including ProForth
  char , dp-char !
  char . fp-char !
: legacy-floats \ -- ; for legacy code
  [char] , dp-char !
  [char] . fp-char !
;

You can of course set these arrays to hold any values which suit your application's language and locale. Note that integer conversion is always attempted before floating point conversion. This means that if the FP-CHAR and DP-CHAR arrays contain the same character, floating point numbers must contain 'e' or 'E'. If the arrays are all different, a number containing the FP-CHAR will be successfully converted as a floating point number, even if it does not contain 'e' or 'E'.
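The effect of these rules can be seen with a short interactive session. This is a sketch that assumes the MPE-FLOATS and LEGACY-FLOATS definitions from the listing above have been loaded, that the floating point package is enabled for number conversion, and that BASE is decimal.

```forth
mpe-floats              \ DP-CHAR holds ',' and '.', FP-CHAR holds '.'
1.5   d.                \ '.' is shared, so this converts as a double integer
1.5e0 f.                \ an exponent is needed for a float

legacy-floats           \ DP-CHAR holds ',', FP-CHAR holds '.'
1,5   d.                \ double integer
1.5   f.                \ now converts as a float without an exponent

mpe-floats              \ restore the MPE defaults
```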

As of January 2007, recommendations made to the Forth200x standards effort have been adopted by MPE for REPRESENT. The impact of these changes is that the minimum buffer size used for REPRESENT should be at least #FDIGITS characters, normally 18 bytes. For details of the proposal, see:

  Examples/usenet/Ed/Represent_20.txt
  Examples/usenet/Ed/Represent_30.txt

Floating point exceptions

Exception handling is determined by the operating system. On current Windows platforms, floating point exceptions are not generated by the NDP. This can be changed by altering the bottom 6 bits of the NDP control word using CW@ and CW!.

By default, the system prompt will report exception status, and clear the pending exception status. Exception status reporting does not mean that the Windows exception handler has been triggered. It only means that the status flag has been set.
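A sketch of changing the exception masks with CW@ and CW!. The word and constant names are hypothetical. The bit assignment is that of the x87 control word, in which a set mask bit suppresses the corresponding exception and bit 2 is the zero-divide mask.

```forth
$3F constant fp-mask-bits       \ hypothetical name ; the six exception mask bits
$04 constant fp-zdiv-bit        \ hypothetical name ; x87 zero-divide mask bit

: mask-fp-all   \ -- ; mask every FP exception
  cw@ fp-mask-bits or cw! ;

: unmask-zdiv   \ -- ; let divide by zero raise an exception
  cw@ fp-zdiv-bit invert and cw! ;
```

Use FCLEX to clear any pending exception flags before unmasking, otherwise a stale flag may trigger the exception on the next floating point instruction.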

Standards compliance, F>S and F>D

After much discussion on the comp.lang.forth newsgroup, a consensus was reached that F>D and F>S must truncate to zero. This is also the behaviour required by the Forth Scientific Library (FSL). Historically, MPE floating point packs permit the integer rounding mode to be set by the user. In order to support both camps, VFX Forth now behaves as follows: F>S and F>D always truncate towards zero, while FR>S and FR>D convert using the current rounding mode.

Configuration

0 value FpSystem
The value FPSYSTEM defines which floating point pack is installed and active. Each floating point pack defines its own type as follows:

#10 constant FPCELL     \ -- n
Defines the size of literals and floating point numbers in memory and on floating point stacks in memory. FPCELL can be changed to 8 for 64 bit double precision or to 4 for 32 bit single precision. If you do this, accuracy and resolution will be reduced.

constant #fdigits       \ -- u
Returns the largest number of usable digits available from REPRESENT. Equivalent to the environment variable MAX-FLOAT-DIGITS.

false constant [fpdebug] immediate
Set this true when compiling NDP387.FTH, and a debug build will be constructed. In this state, the state of the FPU is checked after each word. If a floating point exception has been generated, a diagnostic is issued, and the system aborts. Set this only when testing. Note that the NDP387 optimiser may well cause this to be ignored.

defer fpcheck                           \ see later for real action
A DEFERred word called at the end of CODE routines when [FPDEBUG] is non-zero.

variable signed-zero    \ -- addr
Set non-zero to display signed-zero.

1 constant FPext?       \ -- flag
Set non-zero to compile FP extensions.

Assembler macros

: fword         \ --
Selects appropriate floating point size for the assembler. Note that this is defined by the constant FPCELL. FWORD will be a synonym for FLOAT, DOUBLE or TBYTE.

: fnext,        \ -- ; can be changed for debugging
The equivalent of NEXT, for floating point routines. If [FPDEBUG] is non-zero, a call to FPCHECK is assembled.

Optimiser support

This code is only provided for Ndp387.fth in VFX Forth 3.70 onwards.

0 value FPsin?  \ -- flag
Returns non-zero if source inlining is permitted for words containing floating point code sequences. By default, FP source inlining is disabled.

: [+FPsin       \ -- x
Start a [+FPSIN ... FPSIN] section in which new FP code can be source inlined.

: [-FPsin       \ -- x
Start a [-FPSIN ... FPSIN] section in which new FP code cannot be source inlined.

: FPsin]        \ x --
End a [+/-FPSIN ... FPSIN] section. The previous FP source inliner state is restored.
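A minimal sketch of a source inlining section, for Ndp387.fth only. FSQUARE and FCUBE are hypothetical example words.

```forth
[+FPsin                         \ allow source inlining of the FP words below
: fsquare       \ F: f -- f*f ; hypothetical example word
  fdup f* ;
: fcube         \ F: f -- f*f*f ; hypothetical example word
  fdup fsquare f* ;
FPsin]                          \ back to the previous inliner state
```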

: fseq:         \ -- ; FSEQ: <name> ... ;FSEQ
Start an assembler sequence which is compiled for <name>.

: ;fseq         \ -- ; FSEQ: <name> ... ;FSEQ
Ends an assembler sequence started by FSEQ:.

FP constants

code %0         \ F: -- f#(0)
Floating point 0.0

code %1         \ F: -- f#(1)
Floating point 1.0

code %pi        \ F: -- f#(pi)
Floating point PI

code %pi/2      \ F: -- f#(pi/2)
Floating point PI/2

code %pi/4      \ F: -- f#(pi/4)
Floating point PI/4

code %lg2e      \ F: -- log2(e)
Returns log (base 2) of e.

FP control operations

code finit      \ F: -- ; resets FPU
Reset the floating point unit and NDP stack.

code cw@        \ -- cw ; get NDP control word
Return the floating point unit Control Word.

code cw!        \ cw -- ; set NDP control word
Set the floating point unit Control Word.

code sw@        \ -- sw ; get NDP status word
Return the floating point unit Status Word.

code fclex      \ -- ; clear exceptions
Clear any pending floating point exceptions.

FP Stack operations

code fdup       \ F: f -- f f
Floating point equivalent of DUP.

code fswap      \ F: f1 f2 -- f2 f1
Floating point equivalent of SWAP.

code F2SWAP     \ F: r1 r2 r3 r4 -- r3 r4 r1 r2
Floating point equivalent of 2SWAP.

code fdrop      \ F: f --
Floating point equivalent of DROP.

code fover      \ F: f1 f2 -- f1 f2 f1
Floating point equivalent of OVER.

code frot       \ F: f1 f2 f3 -- f2 f3 f1
Floating point equivalent of ROT.

code fpick      \ n -- ; F: -- f
Floating point equivalent of PICK. Note that because the pick index is an integer, it is on the normal Forth integer data stack, and the result, being a floating point number, is on the floating point stack.

code ndepth     \ -- n ; depth of NDP stack
Returns on the Forth data stack the number of items on the FPU's internal working stack.

: fdepth        \ -- #f
Floating point equivalent of DEPTH. The result is returned on the Forth data stack, not the float stack.

Memory operations SF@ SF! DF@ DF! etc

code f@         \ addr -- ; F: -- f
Places the contents of addr on the float stack. The size of the item fetched was defined by FPCELL at compile time.

code sf@        \ addr -- ; F: -- f
Places the 32 bit float at addr on the float stack.

code df@        \ addr -- ; F: -- f
Places the 64 bit double float at addr on the float stack.

code tf@        \ addr -- ; F: -- f
Places the 80 bit extended float at addr on the float stack.

code f!         \ addr -- ; F: f --
Stores the top of the float stack as an FPCELL sized number at addr.

code sf!        \ addr -- ; F: f --
Stores the top of the float stack as a 32 bit float number at addr.

code df!        \ addr -- ; F: f --
Stores the top of the float stack as a 64 bit double float number at addr.

code tf!        \ addr -- ; F: f --
Stores the top of the float stack as an 80 bit extended float number at addr.

code f+!        \ F: f -- ; addr -- ; add f to data at addr
Add F to the data at ADDR.

code f-!        \ F: f -- ; addr -- ; sub f from data at addr
Subtract F from the data at ADDR.

code sf+!       \ F: f -- ; addr -- ; add f to data at addr
Add F to the 32 bit float at ADDR. NDP387.FTH only.

code sf-!       \ F: f -- ; addr -- ; sub f from data at addr
Subtract F from the 32 bit float at ADDR. NDP387.FTH only.

code df+!       \ F: f -- ; addr -- ; add f to data at addr
Add F to the 64 bit float at ADDR. NDP387.FTH only.

code df-!       \ F: f -- ; addr -- ; sub f from data at addr
Subtract F from the 64 bit float at ADDR. NDP387.FTH only.

code tf+!       \ F: f -- ; addr -- ; add f to data at addr
Add F to the 80 bit float at ADDR. NDP387.FTH only.

code tf-!       \ F: f -- ; addr -- ; sub f from data at addr
Subtract F from the 80 bit float at ADDR. NDP387.FTH only.

code f@+        \ addr -- addr' ; F: -- f
Places the contents of addr on the float stack and increments the address. The size of the item fetched and the increment is defined by FPCELL. NDP387.FTH only.

code sf@+       \ addr -- addr' ; F: -- f
Places the 32 bit float at addr on the float stack, and increments addr by 4. NDP387.FTH only.

code df@+       \ addr -- addr' ; F: -- f
Places the 64 bit float at addr on the float stack, and increments addr by 8. NDP387.FTH only.

code tf@+       \ addr -- addr' ; F: -- f
Places the 80 bit float at addr on the float stack, and increments addr by 10. NDP387.FTH only.

code f!+        \ addr -- addr' ; F: f --
Stores the top of the float stack as an FPCELL sized number at addr, and updates addr appropriately. NDP387.FTH only.

code sf!+       \ addr -- addr' ; F: f --
Stores the top of the float stack as a 32 bit float at addr, and updates addr appropriately. NDP387.FTH only.

code df!+       \ addr -- addr' ; F: f --
Stores the top of the float stack as a 64 bit float at addr, and updates addr appropriately. NDP387.FTH only.

code tf!+       \ addr -- addr' ; F: f --
Stores the top of the float stack as an 80 bit float at addr, and updates addr appropriately. NDP387.FTH only.

Dictionary operations

: tf,           \ F: f --
Lays an 80 bit extended float into the dictionary, reserving 10 bytes.

: df,           \ F: f --
Lays a 64 bit double float into the dictionary, reserving 8 bytes.

: sf,           \ F: f --
Lays a 32 bit float into the dictionary, reserving 4 bytes.

: f,            \ F: f --
Lays a default float into the dictionary, reserving FPCELL bytes.

: falign        \ --
Aligns the dictionary to accept a default float.

: faligned      \ addr -- addr'
Aligns the address to accept a default float.

: float+        \ addr -- addr'
Increments addr by FPCELL, the size of a default float.

: floats        \ n1 -- n2
Returns n2, the size of n1 default floats.

: sfalign       \ --
Aligns the dictionary to accept a 32 bit float.

: sfaligned     \ addr -- addr'
Aligns the address to accept a 32 bit float.

: sfloat+       \ addr -- addr'
Increments addr by the size of a 32 bit float.

: sfloats       \ n1 -- n2
Returns n2, the size of n1 32 bit floats.

: dfalign       \ --
Aligns the dictionary to accept a 64 bit double float.

: dfaligned     \ addr -- addr'
Aligns the address to accept a 64 bit float.

: dfloat+       \ addr -- addr'
Increments addr by the size of a 64 bit double float.

: dfloats       \ n1 -- n2
Returns n2, the size of n1 64 bit double floats.

: tfalign       \ --
Aligns the dictionary to accept an 80 bit extended float.

: tfaligned     \ addr -- addr'
Aligns the address to accept an 80 bit extended float.

: tfloat+       \ addr -- addr'
Increments addr by the size of an 80 bit extended float.

: tfloats       \ n1 -- n2
Returns n2, the size of n1 80 bit extended floats.
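The memory and dictionary words above combine as in the following sketch, which builds a small table of default floats and sums it. VEC and VSUM are hypothetical names. Only words available in both packs are used.

```forth
create vec  1e0 f,  2e0 f,  3e0 f,     \ hypothetical three-element table

: vsum          \ addr n -- ; F: -- sum
  0e0                           \ running total on the FP stack
  0 ?do  dup f@ f+  float+  loop
  drop ;

vec 3 vsum f.                   \ sum of the three elements
4e0 vec float+ f!               \ overwrite the second element
vec 3 vsum f.
```

With Ndp387.fth the loop body could use F@+ instead of the DUP F@ ... FLOAT+ sequence.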

FP defining words

: fvariable     \ F: -- ; -- addr
Use in the form: FVARIABLE <name> to create a variable that will hold a default floating point number.

: farray                \ n -- ; i -- addr
Use in the form: n FARRAY <name> to create an array that will hold n default floating point numbers. When the array name is executed, the index i is used to return the address of the i'th element of the array, counting from zero. For example, 5 FARRAY TEST will set up 5 array elements each containing 0, and then f n TEST F! will store f in the nth element, and n TEST F@ will fetch it.

: fconstant     \ F: f -- ; F: -- f
Use in the form: <float> FCONSTANT <name> to create a constant that will return a floating point number.

: fvalue        \ F: f -- ; ??? -- ???
Use in the form: <float> FVALUE <name> to create a floating point version of VALUE that will return a floating point number by default, and that can accept the operators TO, ADDR, ADD, SUB, and SIZEOF.
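A short sketch using each of the defining words above. All of the names are hypothetical.

```forth
fvariable total                 \ hypothetical names throughout
2.5e0 fconstant scale
0e0 fvalue gain
4 farray coeffs                 \ elements 0..3

3e0 total f!
total f@ scale f*  total f!     \ total = 3 * 2.5
1.5e0 to gain
gain 2 coeffs f!                \ store GAIN in element 2
total f@  2 coeffs f@  f+ f.    \ 7.5 + 1.5
```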

Basic functions + - * / and others

code f+         \ f1 f2 -- f1+f2
Floating point add.

code f-         \ f1 f2 -- f1-f2
Floating point subtract.

code f*         \ f1 f2 -- f1*f2
Floating point multiply.

code f/         \ f1 f2 -- f1/f2
Floating point divide.

code fmod       \ F: f1 f2 -- f3
Floating point modulus. Returns f3 the remainder after repeatedly subtracting f2 from f1. Often used to force arguments to lie in the range: 0 <= arg < f2

code fsqrt      \ F: f -- sqrt(f)
Floating point square root.

code 1/f        \ F: f -- 1/f
Floating point reciprocal.

code fabs       \ F: f -- |f|
Floating point absolute.

code fnegate    \ F: f -- -f
Floating point negate.

code f2*        \ F: f -- f*2
Floating point multiply by two.

code f2/        \ F: f -- f/2
Floating point divide by two.

Integer to FP conversion

code s>f        \ n -- ; F: -- f
Converts a single integer to a float.

code d>f        \ d -- ; F: -- f
Converts a double integer to a float.

code f>s        \ F: f -- ; -- n ; convert float to integer
Converts a float to a single integer. Note that F>S truncates the number towards zero according to the ANS specification. See FR>S below.

code f>d        \ F: f -- ; -- d ; convert float to double integer
Converts a float to a double integer. Note that F>D truncates the number towards zero according to the ANS specification. See FR>D below.

code fr>s       \ F: f -- ; -- n ; convert float to integer
Converts a float to a single integer using the current rounding mode.

code fr>d       \ F: f -- ; -- d ; convert float to double integer
Converts a float to a double integer using the current rounding mode.
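The difference between the truncating and the rounding conversions shows up with any non-integral value. The results in the comments for FR>S assume the default rounding mode, round to nearest.

```forth
 2.7e0 f>s  .           \ 2 ; truncated towards zero
-2.7e0 f>s  .           \ -2
 2.7e0 fr>s .           \ 3 when the rounding mode is round to nearest
-2.7e0 fr>s .           \ -3 when the rounding mode is round to nearest
```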

FP comparisons

: f0<           \ F: f1 -- ; -- t/f ; less than zero?
Floating point 0<. N.B. result is on the Forth integer data stack.

: f0=           \ F: f1 -- ; -- t/f ; equal zero?
Floating point 0=. N.B. result is on the Forth integer data stack.

: f0<>          \ F: f1 -- ; -- t/f ; not equal zero?
Floating point 0<>. N.B. result is on the Forth integer data stack.

: f0>           \ F: f1 -- ; -- t/f ; greater than zero?
Floating point 0>. N.B. result is on the Forth integer data stack.

: f<            \ F: f1 f2 -- ; -- t/f ; one less than the other?
Floating point <. N.B. result is on the Forth integer data stack.

: f=            \ F: f1 f2 -- ; -- t/f ; equal each other?
Floating point =. N.B. result is on the Forth integer data stack.

: f<>           \ F: f1 f2 -- ; -- t/f ; one not equal the other?
Floating point <>. N.B. result is on the Forth integer data stack.

: f>            \ F: f1 f2 -- ; -- t/f ; one greater than the other?
Floating point >. N.B. result is on the Forth integer data stack.

: f<=           \ F: f1 f2 -- ; -- t/f ; one less or equal the other?
Floating point <=. N.B. result is on the Forth integer data stack.

: f>=           \ F: f1 f2 -- ; -- t/f ; one greater or equal the other?
Floating point >=. N.B. result is on the Forth integer data stack.

code fsignbit   \ F: f -- ; -- sign
Return the sign bit of the floating point number. This is not the same as f0< for f=+/-0e0.

: fsign         \ F: r1 -- ; -- sign
Get the sign of floating point r1. The sign is zero for positive numbers and -1 for negative numbers.

: f~            \ F: f1 f2 f3 -- ; -- flag
Approximation function. If f3 is positive, flag is true if abs[f1-f2] is less than f3. If f3 is zero, flag is true if f2 is exactly equal to f1. If f3 is negative, flag is true if abs[f1-f2] is less than abs[f3*abs[f1+f2]].
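Two of the three modes of F~ are sketched below. CLOSE? and SAME? are hypothetical helpers, and the tolerance of 1e-9 is an arbitrary choice for illustration.

```forth
: close?        \ F: f1 f2 -- ; -- flag ; hypothetical helper
  1e-9 f~ ;                     \ positive f3: absolute tolerance

: same?         \ F: f1 f2 -- ; -- flag ; hypothetical helper
  0e0 f~ ;                      \ zero f3: exact equality

0.1e0 0.2e0 f+  0.3e0 close? .  \ true: differs at most by rounding error
1e0 1e0 same? .                 \ true
1e0 2e0 same? .                 \ false
```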

Words dependent on FP compares

: ?fnegate      \ F: f1 f2 -- f3
Floating point NEGATE.

: fmax          \ F: f1 f2 -- f3
Floating point MAX.

: fmin          \ F: f1 f2 -- f3
Floating point MIN.

FP logs and powers

code flog       \ F: f -- log(f)
Floating point log base 10.

code fln        \ F: f -- ln(f)
Floating point log base e.

code 2**        \ F: f -- 2^f
Floating point: returns 2^F.

code fexp       \ F: f -- e^f ; was called FE^X
Floating point e^f.

: fexpm1        \ F: f -- (e^f)-1 ; 12.6.2.1516
Floating point (e^f)-1.

code flnp1      \ F: f1 -- f2
The output f2 is the natural logarithm of the input plus one. An ambiguous condition exists if f1 is less than or equal to negative one.

code falog      \ F: f -- 10^f ; was called f10^f, new name: ans
Floating point anti-log base 10.

code (f**)      \ F: f1 f2 -- f1^f2
Floating point: returns f1 raised to the power f2. No error checking is performed. If floating point exceptions are masked, which is the default condition, the system will return a NaN for f1<0.

: f**           \ F: f1 f2 -- f1^f2
Floating point: returns f1 raised to the power f2. If f1<=0e0, 0e0 is returned. This behaviour is required by the Forth Scientific Library.

Rounding

The default rounding configuration is round to nearest.

: fround        \ F: f1 -- f1'
Round the number to nearest or even.

: ftrunc        \ F: f1 -- f1'
Round the number towards zero, returning an integer result on the FP stack.

: fint          \ F: f1 -- f1'
A synonym for FTRUNC. FINT will be removed in a future release.

: floor         \ F: f1 -- f1'
Floored round towards -infinity.

: roundup       \ F: f1 -- f1'
Round towards +infinity.

: rounded       \ -- ; set NDP to round to nearest
Set NDP to round to nearest for all operations other than FINT, FLOOR and ROUNDUP.

: floored       \ -- ; set NDP to floor
Set NDP to round to floor for all operations other than FROUND, FINT and ROUNDUP.

: roundedup     \ -- ; set NDP to round up
Set NDP to round up for all operations other than FROUND, FINT and FLOOR.

: truncated     \ -- ; set NDP to chop to 0
Set NDP to chop to 0 for all operations other than FROUND, FLOOR and ROUNDUP.
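The rounding words can be compared directly, as in this sketch. The last three lines assume that FR>S follows the mode set by FLOORED, as its description states.

```forth
 2.5e0 fround  f.       \ 2.5 rounds to 2, the nearest even value
-2.7e0 ftrunc  f.       \ -2.7 rounds to -2, towards zero
-2.1e0 floor   f.       \ -2.1 rounds to -3, towards -infinity
 2.1e0 roundup f.       \ 2.1 rounds to 3, towards +infinity

floored                 \ FR>S now follows the floor mode
2.7e0 fr>s .
rounded                 \ restore the default, round to nearest
```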

code flit       \ F: -- f ; takes floating point number inline
Followed in line by a floating point number (FPCELL bytes) returning this number when executed.

defer fliteral  \ F: f -- ; F: -- f
Compiles a float as a literal into the current definition. At execution time, a float is returned. For example, [ %PI F2* ] FLITERAL will compile 2PI as a floating point literal. Note that FLITERAL is immediate, whereas (RLITERAL) below is not.

: (rliteral)    \ F: f -- ; F: -- f
Compiles a float as a literal into the current definition. At execution time, a float is returned. This is the default action of FLITERAL above.
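The FLITERAL example from the text, placed in a complete definition. CIRCUMFERENCE is a hypothetical example word.

```forth
: circumference \ F: radius -- circ ; hypothetical example word
  [ %pi f2* ] fliteral f* ;     \ 2*pi is computed once, at compile time

1e0 circumference f.
```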

FP trigonometry

code ftan       \ F: f -- tan(f)
Floating point tangent.

code fatan      \ F: f -- atan(f)
Floating point arctangent.

code fsin       \ F: f -- sin(f)
Floating point sine.

code fasin      \ F: f -- asin(f)
Floating point arcsine.

code fcos       \ F: f -- cos(f)
Floating point cosine.

code facos      \ F: f -- acos(f)
Floating point arccosine.

code fsincos    \ F: f -- sin(f) cos(f)
Returns sine and cosine values of f.

: deg>rad       \ F: fdeg -- frad
Converts a value in degrees to radians.

: rad>deg       \ F: frad -- fdeg
Converts a value in radians to degrees.

code freduce    \ F: f1 -- f2 ; reduce value to range 0..2pi
Reduce f1 to be in the range 0 <= f2 < 2PI.

: fcosec        \ F: f -- cosec(f)
Floating point cosecant.

: fsec          \ F: f -- sec(f)
Floating point secant.

: fcotan        \ f: f -- cot(f)
Floating point cotangent.

: fsinh         \ F: f -- sinh(f) ; (e^x - 1/e^x)/2
Floating point hyperbolic sine.

: fcosh         \ F: f -- cosh(f) ; (e^x + 1/e^x)/2
Floating point hyperbolic cosine.

: ftanh         \ F: f -- tanh(f) ; (e^x - 1/e^x)/(e^x + 1/e^x)
Floating point hyperbolic tangent.

: fasinh        \ F: f -- asinh(f) ; ln(f+sqrt(1+f*f))
Floating point hyperbolic arcsine.

: facosh        \ F: f -- acosh(f) ; ln(f+sqrt(f*f-1))
Floating point hyperbolic arccosine.

: fatanh        \ F: f -- atanh(f) ; ln((1+f)/(1-f))/2
Floating point hyperbolic arctangent.

Number conversion

: 10**n         \ n -- ; F: -- f
Generate a floating point value 10 to the power n, where n is an integer.

: (>FLOAT)      \ c-addr u -- flag ; F: -- f | --
Try to convert the string at c-addr/u to a floating point number. If conversion is successful, flag is returned true, and a floating number is returned on the float stack, otherwise flag is returned false and the float stack is unchanged.

: >FLOAT        \ c-addr u -- flag ; F: -- f | --
Try to convert the string at c-addr/u to a floating point number. If conversion is successful, flag is returned true, and a floating number is returned on the float stack, otherwise flag is returned false and the float stack is unchanged. Leading and trailing white space are removed before processing. If the resulting string is of zero length, true is returned with a floating point zero. Yes, this is what the standard requires. The previous behaviour without this special case is available as (>FLOAT) above.
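A sketch of using >FLOAT on strings, including the blank string special case described above. The word .FLOAT? is a hypothetical helper.

```forth
: .float?       \ c-addr u -- ; hypothetical helper
  >float if
    ." float: " f.
  else
    ." not a float"
  then ;

s" 1.5e3" .float?
s" bogus" .float?
s"   "    .float?       \ blank input: >FLOAT returns true with 0e0
```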

FP output

A significant portion of the output code is taken from FPOUT v3.7 by Ed. See

  http://dxforth.webhop.org/

or one of its mirrors.

: precision     \ -- u
Returns the number of significant digits used by F. FE. and FS..

: set-precision \ u --
Sets the number of significant digits used by F. FE. and FS..

: places        \ u --
Sets the number of significant digits used by F. FE. and FS.. The ANS version of this word is SET-PRECISION, which should be used in new code.

: BadFloat?     \ F: f -- ; -- caddr u true | false
If the float is a NaN or Infinite, return a string such as "+NaN" and true, otherwise just return false (0).

: represent     \ c-addr len -- n flag1 flag2 ; F: f --
Assume that the floating number is of the form +/-0.xxxxEyy. Round the significand xxxx to len significant digits and place its representation at c-addr. If len is zero round the fractional significand to a whole number. If len is negative the fractional significand is rounded to zero. Flag2 is true if the results are valid. N is the signed integer version of yy and flag1 is true if f is negative. In this implementation all errors are handled by exceptions, and so flag2 is always true except for NaNs and Infinites. The number of characters placed at c-addr is the greater of len or MAX-FLOAT-DIGITS. For a NaN or Infinite, a three character non-numeric string is returned.

: (FS.)         \ F: f -- ; n -- c-addr u
Convert float f to a string c-addr/u in scientific notation with n places right of the decimal point.

: FS.R          \ F: r -- ; n u --
Display float f in scientific notation right-justified in a field width u with n places right of the decimal point.

: FS.           \ F: f --
Display float f in scientific notation, with one digit before the decimal point and a trailing space.

: (FE.)         \ F: r -- ; n -- c-addr u
Convert float f to a string c-addr u in engineering notation with n places right of the decimal point.

: FE.R          \ F: r -- ; n u --
Display float f in engineering notation right-justified in a field width u with n places right of the decimal point.

: FE.           \ F: f --
Display float f in engineering notation, in which the exponent is always a power of three, and the significand is always in the range 1.xxx to 999.xxx.

: (F.)          \ F: f -- ; n -- c-addr u
Convert float f to string c-addr/u in fixed-point notation with n places right of the decimal point.

: F.R           \ F: f -- ; n u --
Display float f in fixed-point notation right-justified in a field width u with n places right of the decimal point.

: F.            \ F: f --
Display f as a float in fixed point notation with a trailing space. The ANS specification says that the display is in fixed-point format, but restricted by PRECISION. What should 1e308 display? In this implementation 1e308 displays a 1 followed by 308 zeros. Several people believe that the specification for F. is broken. For a display word that always provides sensible output, use G. below.

: (G.)          \ F: f -- ; n -- c-addr u
Convert float f to string c-addr/u with n places right of the decimal point. Fixed-point is used if the exponent is in the range -4 to 5 otherwise scientific notation is used.

: G.R           \ F: f -- ; n u --
Display float f right-justified in a field width u with n places right of the decimal point. Fixed-point is used if the exponent is in the range -4 to 5 otherwise scientific notation is used.

: G.            \ F: f --
Display float f followed by a space. Fixed-point is used if the exponent is in the range -4 to 5, otherwise scientific notation is used. Non-essential zeros and signs are removed.

: f?            \ addr -- ; displays contents of addr
Displays the contents of the given FVARIABLE.

: f.s           \ F: i*f -- i*f
Display the contents of the floating point stack in a vertical format.

: f.sh          \ F: i*f -- i*f
Display the contents of the floating point stack in a horizontal format.
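The display words above can be compared side by side, as in this sketch. The precision of 6 is an arbitrary choice for illustration.

```forth
6 set-precision
%pi f.                  \ fixed-point
%pi fs.                 \ scientific
12345.678e0 fe.         \ engineering: exponent is a multiple of three
1e308 g.                \ compact form for very large values
%pi 3 10 f.r            \ 3 places, right-justified in a 10 character field
```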

Patch FP into the system

: isFnumber?    \ caddr len -- 0 | n 1 | d 2 | -2 ; F: -- r
Behaves like the integer version of isNumber? except that if integer conversion fails, and BASE is decimal, a floating point conversion is attempted. If conversion is successful, the floating point number is left on the float stack and the result code is -2.

: Fnumber?      \ caddr -- 0 | n 1 | d 2 | -2 ; F: -- r
As isFnumber? above, but takes a counted string.

: post-float    \ f: f -- ; --
POSTPONE a floating point number. The word being defined will itself compile the given floating point number.

' noop  ' (rliteral)  ' post-float  RecType: r:float    \ -- struct
Contains the interpret, compile and postpone actions for floating point literals.

: rec-float     \ caddr u -- r:float | r:fail ; F: -- [f]
The parser part of the floating point recogniser.

: .FSysPrompt   \ --
Adds floating point stack depth display.

: reals         \ -- ; turn FP system on
Enables the floating point package for number conversion.

: integers      \ -- ; turn FP system off
Disables the floating point package for number conversion.

PFW2.x compatibility

: f#            \ -- f ; or compiles it [ state smart ]
Used in the form "F# <number>", the <number> string is converted and promoted if required to a floating point number. If the system is compiling the float is compiled. If <number> cannot be converted an error occurs.

Debugging support

Debugging floating point code is often difficult, as failures can occur because of the necessary approximations involved in floating point operations.

If you set the constant [FPDEBUG] true when compiling Ndp387.fth, a debug build will be constructed. The state of the FPU will be checked after each word. If a floating point exception has been generated, a diagnostic is issued, and the system aborts. Set this only when testing, as it slows down the normal operation of floating point words.

The debugger works by intercepting the end of each code definition which is finished by FNEXT, rather than the normal NEXT, or RET. See the source code in Lib/x86/Ndp387.fth for more details.

: +fpcheck       \ -- ; enable FP checking
Enables the floating point debugger if it has been compiled.

: -fpcheck       \ -- ; disable FP checking
Disables the floating point debugger if it has been compiled.

Extensions

F.P. stack jugglers

Due to the Mac's usage of fp for all graphic related things, F.P. stack jugglers similar to those for the data stack are handy. We deal with F.P. pairs as used for points, sizes and ranges and F.P. quads for rectangles and colours. F2SWAP F2OVER F2DROP F4DUP FTUCK FNIP do what you expect ...

code F2DUP      \ F: r1 r2 -- r1 r2 r1 r2
Floating point equivalent of 2DUP.

code F2SWAP     \ F: r1 r2 r3 r4 -- r3 r4 r1 r2
Floating point equivalent of 2SWAP.

code F2OVER     \ F: r1 r2 r3 r4 -- r1 r2 r3 r4 r1 r2
Floating point equivalent of 2OVER.

code F2DROP     \ F: r1 r2 --
Floating point equivalent of 2DROP.

code F4DUP      \ F: r1 r2 r3 r4 -- r1 r2 r3 r4 r1 r2 r3 r4
Floating point equivalent of 4DUP.

code FTUCK      \ F: r1 r2 -- r2 r1 r2
Floating point equivalent of TUCK.

code FNIP       \ F: r1 r2 -- r2
Floating point equivalent of NIP.