Working with Files

Source file names

The following words are useful when writing your own tools.

: .SourceName   \ ^SFSTRUCT --
Given a source file structure, such as that held by the variable 'SourceFile, display the current file name.

: CurrSourceName        \ -- c-addr u
Returns the current source file name without expanding any text macros.

: stripFilename \ cstring --
The input is a counted string containing a full path and filename e.g. "C:\WINDOWS\SYSTEM32\COMMAND.COM". The file name is removed to leave "C:\WINDOWS\SYSTEM32". Note that the actual directory separator used depends on the host operating system.

ANS File Access Wordset

The basis for all file operations is the ANS specification wordset for Files. The following definitions implement the ANS standard set.

The following data types are used:

fam

"File Access Method", describes read/write permission etc.

ior

"IO Result", a result returned by most IO calls. The value is 0 for success, or a non-zero error code.

fileid

"File Identifier", a handle for a file.

: bin           \ fam -- 'fam
Modify a file-access method to include BINARY.

: r/o           \ -- fam
Return the read-only fam.

: w/o           \ -- fam
Return the write-only fam.

: r/w           \ -- fam
Return the read/write fam.

: Create-File   \ c-addr u fam -- fileid ior
Create a file on disk, returning a 0 ior for success and a file id. Macro names are expanded before the operating system file create call is made.

: Open-File     \ c-addr u fam -- fileid ior
Open an existing file on disk. Macro names are expanded before the operating system file open call is made.

: ?Relative-Open-File   \ c-addr u fam -- fileid ior
Open an existing file on disk. Macro names are expanded before the operating system file open call is made. If the first two characters of the file name are './' the file path is taken to be relative to the directory of the containing include file.

: Close-File    \ fileid -- ior
Close an open file. The correct method is used for VFCACHED files.

: Write-File    \ caddr u fileid -- ior
Write a block of memory to a file.

: write-line    \ c-addr u fileid -- ior
Write data followed by EOL. IOR=0 for success. Note that the end of line sequence is given by EOL$ and is operating system dependent.
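
As an illustration of the words above, the following sketch creates a text file, writes two lines to it and closes it again. The file name demo.txt and the word WRITE-DEMO are hypothetical, and any non-zero ior is simply passed to THROW.

```forth
\ Sketch only: "demo.txt" and WRITE-DEMO are hypothetical names.
: write-demo    \ --
  s" demo.txt" w/o create-file throw    \ -- fileid
  >r                                    \ keep the fileid on the return stack
  s" first line"  r@ write-line throw
  s" second line" r@ write-line throw
  r> close-file throw
;
```

Using THROW on each ior keeps the example short; production code may prefer to close the file before reporting an error.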

: Read-File     \ caddr u fileid -- u2 ior
Read data from a file, using the VFCACHE version where appropriate. The number of characters actually read is returned as u2, and ior is 0 for a successful read.

: read-line     \ c-addr u1 fileid -- u2 flag ior       11.6.1.2090
Read an ASCII line of text from a file into a buffer, without EOL. Read the next line from the file specified by fileid into memory at the address c-addr. At most u1 characters are read. Up to two line-terminating characters may be read into memory at the end of the line, but are not included in the count u2. The line buffer provided by c-addr should be at least u1+2 characters long.

If the operation succeeds, flag is true and ior is zero. If a line terminator was received before u1 characters were read, then u2 is the number of characters, not including the line terminator, actually read (0 <= u2 <= u1). When u1 = u2, the line terminator has yet to be reached.

If the operation is initiated when the value returned by FILE-POSITION is equal to the value returned by FILE-SIZE for the file identified by fileid, flag is false, ior is zero, and u2 is zero. If ior is non-zero, an exception occurred during the operation and ior is the I/O result code.

An ambiguous condition exists if the operation is initiated when the value returned by FILE-POSITION is greater than the value returned by FILE-SIZE for the file identified by fileid, or if the requested operation attempts to read portions of the file not written.

At the conclusion of the operation, FILE-POSITION returns the next file position after the last character read.
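
The behaviour described above leads to the usual READ-LINE loop, sketched here as a word that counts the lines in a named file. The names /LINE, LINE-BUF and COUNT-LINES are hypothetical, and the buffer is sized at u1+2 characters as recommended above.

```forth
\ Sketch only: /LINE, LINE-BUF and COUNT-LINES are hypothetical names.
256 constant /line                      \ maximum characters per read (u1)
create line-buf  /line 2 + chars allot  \ buffer of u1+2 characters

: count-lines   \ c-addr u -- n
  r/o open-file throw  >r               \ keep the fileid on the return stack
  0                                     \ -- n
  begin
    line-buf /line r@ read-line throw   \ -- n u2 flag
  while
    drop 1+                             \ discard the length, count the line
  repeat
  drop                                  \ discard u2 from the final read
  r> close-file throw
;
```

Note that a line longer than /LINE characters is returned in several pieces, so this sketch would count it more than once.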

: file-size     \ fileid -- ud ior
Get size in bytes of an open file as a double number, and return ior=0 on success.

: file-position \ fileid -- ud ior
Return file position, and return ior=0 on success.

: Reposition-File       \ ud fileid -- ior
Set file position, and return ior=0 on success.
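
FILE-SIZE and REPOSITION-FILE can be combined to append to an existing file. The following is a minimal sketch, assuming the file was opened with a fam that permits writing; the word APPEND-LINE is hypothetical.

```forth
\ Sketch only: APPEND-LINE is a hypothetical name.
: append-line   \ c-addr u fileid --
  >r
  r@ file-size throw            \ -- c-addr u ud
  r@ reposition-file throw      \ -- c-addr u ; now at the end of the file
  r> write-line throw
;
```

The file is left open, so several lines can be appended before the caller closes it.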

: Resize-File   \ ud fileid -- ior
Set the size of the file to ud, an unsigned double number. After using RESIZE-FILE, the result returned by FILE-POSITION may be invalid. Note that for a VF-CACHEd file, this operation is performed on the underlying physical file.

: delete-file   \ caddr len -- ior
Delete a named file from disk, and return ior=0 on success. Text macros will be expanded before the file is deleted.

: FileExists?   \ caddr len -- flag
Look to see if a specified file exists, returning TRUE if the file exists. Text macros are expanded.

: RelFileExists?        \ caddr len -- flag
Look to see if a specified file exists, returning TRUE if the file exists. A name starting with './' is treated as relative to the directory of the containing include file. Text macros are expanded.

: FileExist?    \ caddr len -- flag
Use FileExists? above. OBSOLETE, WILL BE REMOVED.

: RelFileExist?    \ caddr len -- flag
Use RelFileExists? above. OBSOLETE, WILL BE REMOVED.

: file-status   \ caddr len -- x ior                     11.6.2.1524
Return the status of the file identified by the character string caddr/len. If the file exists, ior is zero; otherwise ior is the implementation-defined I/O result code. x contains implementation-defined information about the file (always zero for VFX Forth).
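
Because ior is zero when the file exists, FILE-STATUS can serve as a portable existence test. The following sketch uses only ANS words; the name EXISTS? is hypothetical and is not the definition of FileExists? above.

```forth
\ Sketch only: EXISTS? is a hypothetical name.
: exists?       \ c-addr u -- flag
  file-status   \ -- x ior
  nip 0=        \ discard x ; ior=0 means the file exists
;
```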

: rename-file   \ caddr1 len1 caddr2 len2 -- ior         11.6.2.2130
Rename the file named by the character string caddr1/len1 to the name in the character string caddr2/len2. Ior is the I/O result code.

: flush-file    \ fileid -- ior
Flush changed file data to disk, and return ior=0 on success.

: include-file          \ file-id --
Include source code from an open file whose file-id (handle) is given. The file is closed by INCLUDE-FILE.

: included      \ c-addr u --
Include source code from a file whose name is given by c-addr/u. Text macros will be expanded before the file is opened.

: include       \ "<name>" --
A more convenient form of INCLUDED. Use in the form:

  INCLUDE <name>

Text macros will be expanded before the file is opened. See GetPathSpec for a discussion of file name formats including spaces.

: required      \ c-addr u --
If the file specified by c-addr/u has already been INCLUDED, discard c-addr/u; otherwise, perform the function of INCLUDED. You must provide the source file's extension.

: require       \ "<name>" --
Skip leading white space and parse name delimited by a white space character. Put the address and length of the name on the stack and perform the function of REQUIRED. You must provide the source file's extension.

: data-file     \ -- size ; DATA-FILE <filename>
Loads a file to memory at HERE and ALLOTs memory. The size of the file is returned. This is a good way to load data directly into the dictionary at compile time. It avoids having to convert binary data into streams of digits and commas. For example, DocGen keeps a CSS file in the dictionary:


CREATE BootstrapAddr    \ --
  data-file bootstrap.min.css     \ load the file
constant /Bootstrap               \ keep the length

File Caching

VFX Forth supports memory caching of read-only files. Any file which is to be cached is opened using VF-OPEN-FILE rather than the ANS word OPEN-FILE. The normal ANS wordset can then be used, and re-vectoring to the cached versions is automatic. The control directive +VFCACHE (see later) enables INCLUDE and friends to use file caching automatically, which decreases compilation time for larger projects.

: IsFileIDCached?       \ fileid -- flag
Determine if an open file referenced by FILEID is a cached file.

: VF-Open-File          \ caddr len fam -- fileid ior
Open a file using VFCACHE Mode. This means read the whole file into memory.

: VF-Close-File         \ fileid -- ior
Close a VFCACHED file, i.e. free its memory.

: VF-Read-File          \ caddr u fileid -- u2 ior
Read into a buffer from a VFCached file.
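
As a sketch of how a cached file might be used, the following word opens a file with VF-OPEN-FILE and then reads and closes it with the ordinary ANS words, relying on the automatic re-vectoring described above. The file name table.txt and the word SHOW-FIRST-LINE are hypothetical.

```forth
\ Sketch only: "table.txt" and SHOW-FIRST-LINE are hypothetical names.
: show-first-line       \ --
  s" table.txt" r/o VF-Open-File throw  >r
  pad 80 r@ read-line throw             \ -- u2 flag
  if  pad swap type  else  drop  then   \ display the line if one was read
  r> close-file throw                   \ CLOSE-FILE handles cached files
;
```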

: Mem-Open-File \ c-addr u fam -- fileid ior
Open a memory block caddr/u using VFCACHE mode. Fam is ignored. When this file is closed, no attempt is made to FREE caddr/u.

: IncludeMem    \ c-addr u --
Include source code from a memory buffer. Errors cause a THROW.
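
For example, a definition can be compiled from source text held in a string. This is only a sketch; the word GREET is hypothetical, and the buffer is assumed to hold ordinary Forth source text.

```forth
\ Sketch only: GREET is a hypothetical name.
s" : greet  42 ;" IncludeMem    \ compile the text as if it came from a file
greet .                         \ GREET is now an ordinary word
```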

"Smart File" Inclusion

Any pathname used to include source from a text file passes through the Smart File filter. This code attempts to resolve the file extension for a name passed to it. The resolve algorithm looks for the file path as specified, then with a number of common file extensions. See the ResolveIncludeFileName definition below. If no match is found then the original name is passed back.

TRUE value bSmartFileLookUp?
When non-zero, the smart file filter is enabled. See also +SMARTINCLUDE and -SMARTINCLUDE which should be used to control the smart file filter.

: dirChar?      \ char -- flag
Returns true if the character is one of the two directory separators specified in the system variables DIR1-CHAR and DIR2-CHAR.

: Extension?    \ c-addr u -- len true | false
Treats c-addr/u as a file name and returns the extension length and true if the file name has an extension (i.e. it ends in '.xxx'), or just false if no extension is present. The extension can be of any length (including 0) as names of the form "name." are treated as having an extension. Unfortunately such names can exist. A name of zero length returns false.
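
The rules above can be illustrated by a portable sketch that scans backwards from the end of the name. This is not the VFX Forth implementation of Extension?: the names SEP? and EXT-LEN? are hypothetical, and the directory separators are assumed to be '/' and '\' rather than being taken from DIR1-CHAR and DIR2-CHAR.

```forth
\ Sketch only: SEP? and EXT-LEN? are hypothetical names.
: sep?          \ char -- flag ; assumed separators are / (47) and \ (92)
  dup 47 = swap 92 = or
;

: ext-len?      \ c-addr u -- len true | false
  0 rot rot                     \ -- len c-addr u
  begin
    dup
  while
    2dup + 1- c@                \ fetch the last unexamined character
    dup [char] . = if
      drop 2drop true exit      \ found the dot: -- len true
    then
    sep? if
      2drop drop false exit     \ reached a directory separator first
    then
    1- rot 1+ rot rot           \ one character fewer, extension one longer
  repeat
  2drop drop false              \ no dot anywhere, or an empty name
;
```

With these definitions, a name of the form "name." yields a length of 0 with a true flag, and an empty name yields false, matching the description above.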

: ChangeEXT3    \ c-addr u c-addr1 u1 -- c-addr u
Change the last 3 characters of the string at c-addr u to use the text at c-addr1 u1 (where u1 is always 3).

: ResolveIncludeFileName        \ c-addr u -- c-addr u
Given what may be an extension-less filename, attempt to locate a matching file and return its string description. Note that the returned string is built at HERE. Matching rules are:

The extensions ".BLD", ".FTH", ".F", ".CTL" and ".SEQ" are searched for in that order. For case-sensitive file systems, lower case extensions are tried before upper case. Mixed case is not attempted.

Source File Tracking

VFX Forth automatically keeps track of compiled source files. Whenever a new source file is compiled into the system, the file location and dictionary impact are recorded. One use of this system is LOCATE, specified below, which attempts to find the source for a definition and load it into your favourite editor for review.

Many users keep their source code in a path (directory or folder) with all the files being loaded by a control file which contains many lines of the form:


include part1\petrol
include part1\gas
include part2\forms
include part2\recalculate
...

If the source code is moved, for example to a laptop, the new path may be different and LOCATE and friends may then fail. In order to cope with this, additional tracking text can be added at the start of the file name. This text is usually a macro name. What text is added is controlled by the value BuildLevel and the macro DEVPATH.

If BuildLevel is set to 0, no additional information is added. If BuildLevel is set to -1, the contents of the macro DEVPATH are prepended to the file name. Do not set BuildLevel to any other values!

DEVPATH may itself contain a macro name. LOCATE expands macros before attempting to open the file. This enables you to partition an application across several build phases, and still be able to LOCATE words when the tree structures have been moved or modified.
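
A control file might therefore bracket its includes as in the following sketch. It assumes that the macro DEVPATH has already been defined elsewhere to name the root of the source tree; how DEVPATH is defined is not shown here.

```forth
\ Sketch only: assumes the macro DEVPATH has already been defined.
-1 to BuildLevel        \ prepend the contents of DEVPATH to tracked names
include part1\petrol
include part1\gas
0 to BuildLevel         \ later files are tracked with no additional text
```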

0 value BuildLevel      \ -- n
Used to control what is added to the start of the file name for source file tracking. See above for more details.

: +source-files \ --
Enable source file tracking.

: -source-files \ --
Disable source file tracking.

defer sourceTrackRename \ zaddr --
A hook so that names for the source file tracking system can be updated to suit user habit. The input zaddr is a pointer to a buffer containing a zero-terminated file name. The updated name must be returned in the same buffer. The buffer is of size MAX_PATH bytes. The default action is drop.
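
As a sketch of a possible hook action, the following word converts every backslash in a zero-terminated name to a forward slash, updating the buffer in place as the hook requires. The name SLASHES>FWD is hypothetical.

```forth
\ Sketch only: SLASHES>FWD is a hypothetical name.
: slashes>fwd   \ zaddr --
  begin
    dup c@ ?dup                 \ stop at the zero terminator
  while
    92 = if  47 over c!  then   \ replace \ (92) with / (47)
    char+
  repeat
  drop
;
```

Assuming the usual IS syntax for DEFERred words, the action could then be installed with:

  ' slashes>fwd is sourceTrackRename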

: AddSourceFile \ c-addr u -- 'c-addr 'u ^SFSTRUCT | c-addr u -1
Add a source file to the tracking vocabulary. caddr/u represents the pathname supplied to INCLUDED.

: (whereis)     \ xt -- c-addr u line# TRUE | FALSE
Given the XT of a word, return the filename string, the line number and TRUE for the definition. If the xt cannot be found, only FALSE is returned.
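
The two possible results can be handled as in the following sketch, which displays the file name and line number for an xt. The word .WHERE is hypothetical and relies only on the stack effect given above.

```forth
\ Sketch only: .WHERE is a hypothetical name.
: .where        \ xt --
  (whereis) if                  \ -- c-addr u line#
    >r type                     \ display the file name
    ."  line " r> .             \ then the line number
  else
    ." no source information recorded"
  then
;
```

It could be used in the form ' <name> .WHERE for any word found in the dictionary.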

: whereis       \ -- ; WHEREIS <name>
Use in the form WHEREIS <name> to find the source location of a word.

: source-info   \ c-addr u -- start end size true | false
Return dictionary start/end and binary size of a compiled source file from a string. Returns FALSE only if the source name was not recognized.

defer .locate   \ --
Perform the desired action of LOCATE below. The LOCATE_PATH and LOCATE_LINE macros have been set up.

defer .nolocate \ --
Perform the action of LOCATE below when the word has been found but has no source information.

: LocateInfo    \ caddr u line# --
Set the locate macros using caddr/u as the file name and line# as the line number. The file name is expanded.

: locate        \ <"name"> --
Use in the form LOCATE <name> to display the source code of <name>. This word is redefined by the Windows Studio environment.

: .sources      \ --
Display a list of the sources used in the build so far, including the size, source file name and dictionary pointers of each.

Control Directives

The following words can be used to control the filesystem extensions.

: +VFCACHE              \ --
Enable caching of read-only files when opened.

: -VFCACHE              \ --
Disable caching of read-only files.

: +SMARTINCLUDE         \ --
Enable smart resolution of file extensions when including sources.

: -SMARTINCLUDE         \ --
Disable smart resolution of file extensions when including sources.

: +VERBOSEINCLUDE       \ --
Enable verbose mode for file includes and overlay handling.

: -VERBOSEINCLUDE       \ --
Disable verbose mode for file includes and overlay handling.