Linux TCP and UDP Socket Bindings





Defines a socket wordset to be used in a variety of situations. The main I/O words have been defined with a similar stack effect as the file wordset. The ior codes are throwable.

Choosing names for this BSD-style socket API has been problematic because it uses common words such as bind or close that could clash with other applications or wordlists. I have opted to append an s (for socket) to most of these words to keep them readable and concise while avoiding name clashes. So close becomes closes and bind becomes binds.

Regarding portability, no special effort has been made to produce a portable library, only to make it work. It would probably be quite easy to port it to GForth. Some assumptions made:




Non-blocking sockets and multitasking

A non-blocking I/O wordset has been included for use with cooperative multitaskers. Since VFX Forth for Linux includes a preemptive multitasker, blocking I/O can be used within each task and the non-blocking counterparts are, in principle, of little interest. However, facilities like messages between tasks or task events are driven using the traditional pause word. If you plan to use such facilities, then you must use the non-blocking words to ensure that pause is called frequently enough.




Glossary

Handling Linux errno codes

Socket I/O uses a variety of system calls that may fail for a number of reasons. The following words help keep track of errno codes thrown as exceptions.

As far as I know, VFX does not yet embed all the errno codes in its error handling wordset, nor does it offer an easy way to automatically read the z-strings returned by strerror() and define them with #AnonErr. All of VFX's error defining words are parsing words.

Note: This section is likely to change soon.

-5000 Value errno-base
Base value for OS errno throw codes. Can be customized at runtime to coexist with other throw codes.

: errno>excp                 \ errno -- n1
Converts Linux errno values into throw codes.

: excp>errno                \ n1 -- errno
Converts throw code back to genuine Linux errno values.
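
As an illustration of the mapping, here is a minimal stand-alone sketch. It assumes that a throw code is obtained by subtracting the errno value from errno-base, which is consistent with the requirement, stated for .errno-excp, that thrown values lie below errno-base. The actual definitions in the library may differ.

```forth
\ Sketch only. ASSUMPTION: throw code = errno-base - errno.
-5000 Value errno-base

: errno>excp  ( errno -- n1 )  errno-base swap - ;
: excp>errno  ( n1 -- errno )  errno-base swap - ;   \ the mapping is its own inverse

32 errno>excp .              \ EPIPE expressed as a throw code
32 errno>excp excp>errno .   \ and back to the plain errno value
```

Under that assumption, changing errno-base with TO shifts the whole range of errno throw codes at once, which is what lets them coexist with other throw codes.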

AliasedExtern: errno int * __errno_location(void);
errno is the well known errno C global variable used by libraries and system calls. Actually, it is a thread local (aka USER) variable. Can be read by @ and written by !

: throwable                  \ flag -- -ior
Takes a boolean flag and converts it into a throwable ior. A false flag yields a zero ior.

Extern: char * strerror(int errnum);    \ -- z-addr
Get the human readable errno string. See man strerror for details.

: .errno              \ n1 --
Given the Linux errno number n1, prints the associated z-string. A factor for the word below.

: .errno-excp           \ n1 --
Prints the errno exception value thrown in n1. n1 must be < errno-base.

Handling IP addresses

The following words help using and declaring IP addresses.



Reusable bits

struct /hostent                         \ -- len
Mimics the Linux struct hostent. See man gethostbyname.

struct /sockaddr_in                     \ -- len
Mimics the Linux struct sockaddr_in. See man ip.

2  Constant PF_INET
IP sockets family.

0  Constant INADDR_ANY
Used to listen to any IP address of a host.

-1 Constant INADDR_BROADCAST
The special broadcast address 255.255.255.255 encoded as binary.

The following constants are the possible values left in the h_errno C global variable. Comments are extracted from the <netdb.h> header file.

-1 Constant NETDB_INTERNAL
Internal error. See errno.

0 Constant NETDB_SUCCESS
No problem.

1 Constant HOST_NOT_FOUND
Authoritative answer: host not found.

2 Constant TRY_AGAIN
Non-Authoritative: Host not found, or SERVERFAIL.

3 Constant NO_RECOVERY
Non recoverable errors, FORMERR, REFUSED, NOTIMP.

4 Constant NO_DATA
Valid name, no data record of requested type.

4 Constant NO_ADDRESS
No address, look for MX record. Currently, the same value as NO_DATA.

AliasedExtern: h_errno int * __h_errno_location (void);
h_errno is the h_errno C global variable. Actually, it is a thread local (aka USER) variable. Can be read by @ and written by !

AliasedExtern: ux-gethostbyname void *  gethostbyname(const char * name);
See man gethostbyname.

Extern: uint32  htonl(uint32 hostlong);
See man htonl.

Extern: uint16  htons(uint16 hostshort);
See man htons.

: c-string                       \ c-addr1 u -- c-addr2
Converts a Forth string into a null-terminated C string. The z-string c-addr2 is stored in the PAD. Taken from GForth.

: gethostbyname                           \ c-addr1 u -- addr2 ior
Get a struct hostent from a symbolic name or an IP address in dotted notation. Returns the /hostent structure base address addr2 and an ior. Negative ior values can be thrown as exceptions. Positive ior values denote one of the failed search results HOST_NOT_FOUND, TRY_AGAIN, NO_RECOVERY, NO_DATA or NO_ADDRESS. addr2 is undefined if ior <> 0.

: host>addr                 \ c-addr1 u -- u1
Converts an Internet host name into a binary IPv4 address u1. The resulting address is in network byte order. This word may throw errno exceptions if internal errors are detected. It may also call abort" with the texts shown in the constants glossary above if the search result is one of HOST_NOT_FOUND, TRY_AGAIN, NO_RECOVERY, NO_DATA or NO_ADDRESS.

: port!                   \ port addr1 --
Fills in the /sockaddr_in structure or EndPoint: given by addr1 with the PF_INET protocol family and port.

struct /endpoint                        \ -- addr
Extends the /sockaddr_in structure with a ep.rlen field for convenience. This field is a convenient placeholder for storing returned length of a /sockaddr_in structure in various Linux system calls. Allocation of /endpoint structures can be made with allocate or EndPoint: (recommended).



Main words

: EndPoint:                     \ "<name>" -- addr1
Defining word to create an IP end point. The /sockaddr_in structure address addr1 is returned so that it can be initialized by one of the words known-ip, unknown-ip, any-ip or broadcast-ip.
Examples:



 s" localhost" 7624 EndPoint: remote known-ip \ for clients
 7624 EndPoint: local any-ip                  \ for servers
 EndPoint: receiver unknown-ip                \ to use in recvsfrom

: known-ip                              \ c-addr1 u1 port addr2 --
An initializer for the /sockaddr_in given by addr2 that declares a well-known port plus the IP address c-addr1 u1, typically used by clients to connect to remote peers or servers. The host string can be either a symbolic name or an IP address in dotted notation. This word calls the resolver word host>addr, so it may throw errno exceptions.

: any-ip                   \ port addr1 --
An initializer for the /sockaddr_in given by addr1 that declares an end point ready to listen on a given port and on any IP address. Typically used by servers.

: broadcast-ip                   \ port addr1 --
An initializer for the /sockaddr_in given by addr1 that declares an end point ready to send broadcast messages to a given port. Can be used by clients or in UDP-based application protocols to announce and discover.

: unknown-ip               \ addr1 --
An initializer to a /sockaddr_in given by addr1 that declares an uninitialized end point to be filled by binds, accepts or recvsfrom words. Syntactic sugar.

TCP/IP Sockets



Reusable bits

Description of most words in this section is available by generating the additional detailed level of documentation.

AliasedExtern: ux-socket int socket(int domain, int type, int protocol);
See man socket.

AliasedExtern: ux-bind int bind(int fd, void * my_addr, int addrlen);
See man bind.

AliasedExtern: ux-listen int listen(int fd, int backlog);
See man listen.

AliasedExtern: ux-connect int connect(int fd, const int * serv_addr, int addr_len);
See man connect.

AliasedExtern: ux-accept int accept(int fd, void * rem_addr, int * addr_len);
See man accept.

AliasedExtern: ux-close int close(int fd);
See man close.

AliasedExtern: ux-send int send(int s, const void * msg, int len, int flags);
See man send.

AliasedExtern: ux-recv int recv(int s, void * msg, int len, int flags);
See man recv.

Extern: int setsockopt(int fd, int level, int opt, void * val, int vallen);
See man setsockopt.

1 Constant SOL_SOCKET
Select socket level in get/setsockopt()

2 Constant SO_REUSEADDR
Socket level option to reuse address. Typically used in server sockets.

1 Constant SOCK_STREAM
Stream type socket.

2 Constant SOCK_DGRAM
Datagram type socket.

: reuse-IP               \ fd -- ior
Configure a socket fd to reuse its IP address through setsockopt. Usually done at servers. Returns a throwable ior.



Main words

#6  Constant TCP
Used to create a TCP socket with socket. Same value as Linux #define IPPROTO_TCP.

#17 Constant UDP
Used to create a UDP socket with socket. Same value as Linux #define IPPROTO_UDP.

: socket                  \ n1 -- fd ior
Creates a TCP or UDP socket given the constant TCP or UDP passed in n1. Returns a descriptor fd and throwable ior. fd is undefined when the ior is not zero. Socket I/O on the returned fd is blocking by default. See man socket.

: connects                   \ addr fd -- ior
Client connects to a remote socket server. addr is a remote /sockaddr_in or /endpoint structure. Returns a throwable ior.

: binds                   \ addr fd -- ior
Binds the server socket to a port and address (ANY address usually). Reuse port+IP address if possible. addr can be either a /sockaddr_in or a /endpoint structure. Returns a throwable ior.

#5 Value BackLog
Backlog of pending connections used by listens.

: listens              \ fd -- ior
Server socket fd listens to incoming connections. Returns a throwable ior. See man listen.

: accepts                        \ addr1 fd1 -- fd2 ior
Accept an incoming socket connection. Returns a new socket descriptor fd2 and a throwable ior. addr1 must be a defined /endpoint struct, which is filled in with the incoming client IP address and port. fd2 is undefined when ior is non-zero. Socket I/O on the returned fd2 is blocking by default. See man accept.

: closes           \ fd -- ior
Close the socket. Returns a throwable ior. See man close.

$4000 Constant MSG_NOSIGNAL
Transmission/reception flag that prevents <recvs> or <sends> from raising a SIGPIPE signal when the other end closes a connection. Only applicable to connection-oriented sockets. See man send or man recv.

$02 Constant MSG_PEEK
Reception flag to make <recvs> or <recvsfrom> peek at incoming data without dequeueing from system internal buffers. See man recv.

$100 Constant MSG_WAITALL
Reception flag to make <recvs> or <recvsfrom> block until all requested bytes have been read. See man recv.

: <sends>                                 \ c-addr1 +n1 fd n2 -- +n3 ior
Send n1 bytes starting from c-addr1 through socket given by fd. Transmission flags must be or-ed in a single n2. The actual amount sent is returned in n3. In certain conditions, n3 can be < n1. Returns a throwable ior. n3 is undefined if ior is non-zero. See man send.

: sends                           \ c-addr1 +n1 fd -- +n2 ior
The common use case for transmission. SIGPIPE is disabled. If the remote end closes the connection, ior will contain an EPIPE throwable errno code. See man send. Defined as :

   MSG_NOSIGNAL <sends> ;

: <recvs>  \ c-addr1 +n1 fd n2 -- +n3 ior
Receive up to n1 bytes through the socket given by fd and copy them into c-addr1. Reception flags must be or-ed in a single n2. Returns the actual amount received in n3 and a throwable ior. If there is no data, a blocking socket will block until something is available. If the socket is non-blocking and there is no data, ior will contain an EAGAIN throwable exception. See man recv.

#32 Constant EPIPE
errno code returned when the other end closes a connection.

: epipe?                                  \ #read ior1 -- #read ior2
Generate an artificial, throwable EPIPE errno code when the remote end closes the connection and signals are disabled. This happens when <recvs> returns 0 0.
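
The rule above can be sketched in a few lines. This is a stand-alone illustration, not the library source: the body of errno>excp shown here is an assumption (throw code = errno-base minus errno).

```forth
\ Sketch only; not meant to be loaded on top of the library.
-5000 Value errno-base
#32 Constant EPIPE
: errno>excp  ( errno -- n1 )  errno-base swap - ;   \ ASSUMED mapping

: epipe?  ( #read ior1 -- #read ior2 )
  2dup or 0= if                 \ 0 bytes read and no error: the peer closed
    drop EPIPE errno>excp       \ replace the zero ior with a throwable EPIPE
  then ;
```

Any other combination of byte count and ior passes through unchanged, so the word can be appended to a receive call without affecting the normal path.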

: recvs                           \ c-addr1 +n1 fd -- +n2 ior
The common use case for reception. If the remote side closes the connection, recvs generates an artificial errno code through epipe?. Defined as :

   MSG_NOSIGNAL <recvs> epipe? ;

: recvs-all                           \ c-addr1 +n1 fd -- +n2 ior
Blocks until all n1 bytes are received. If the remote side closes the connection, recvs-all generates an artificial errno code through epipe?. Defined as :

   [ MSG_WAITALL MSG_NOSIGNAL or ] literal <recvs> epipe? ;
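
To show how the TCP words fit together, here is a minimal sketch of a one-shot client and a one-shot echo server. It assumes this library is already loaded in VFX Forth for Linux and that a word created by EndPoint: returns the address of its structure when executed. Port 7624, the buffer size and the names remote, local, peer, buf, client-once and server-once are illustrative only.

```forth
\ Assumes this socket library is loaded in VFX Forth for Linux.
s" localhost" 7624 EndPoint: remote known-ip   \ client side
7624 EndPoint: local any-ip                    \ server side
EndPoint: peer unknown-ip                      \ filled in by accepts

Create buf 256 allot

: client-once  ( -- )
  TCP socket throw >r                  \ R: fd
  remote r@ connects throw
  s" hello" r@ sends throw drop        \ ignore the byte count
  buf 256 r@ recvs throw               \ -- #received
  buf swap type cr
  r> closes throw ;

: server-once  ( -- )
  TCP socket throw >r                  \ R: listen-fd
  local r@ binds throw
  r@ listens throw
  peer r@ accepts throw >r             \ R: listen-fd conn-fd
  buf 256 r@ recvs throw               \ -- #received
  buf swap r@ sends throw drop         \ echo it back
  r> closes throw                      \ close the connection
  r> closes throw ;                    \ close the listening socket
```

For brevity every ior is thrown immediately, so a failure leaves the socket open. Real code would wrap the body with catch and call closes before re-throwing.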

UDP Sockets

This section defines additional words to handle UDP sockets.



Reusable bits

AliasedExtern: ux-sendto int sendto(int, const void *, int, int, const void *, int);
See man sendto.

AliasedExtern: ux-recvfrom int recvfrom(int, void *, int, int, void *, int *);
See man recvfrom.



Main words

: sendsto                            \ c-addr1 +n1 addr2 fd -- +n2 ior
Send n1 bytes starting from c-addr1 through socket given by fd to an /endpoint or /sockaddr_in given by addr2. n2 is the actual amount of data sent. Returns a throwable ior. n2 is undefined if ior is non-zero. See man sendto.

: <recvsfrom>                { c-addr1 n1 addr2 fd n2 -- +n3 ior }
Receive data from a remote /endpoint addr2. n1 is the amount to read through socket fd and c-addr1 is the receiving buffer. Reception flags in n2 must be or-ed together. Returns the actual amount received in n3 and a throwable ior. See man recvfrom.

: recvsfrom                          \ c-addr1 +n1 addr2 fd -- +n2 ior
The common use case for reception. Equivalent to <recvsfrom> when n2 = 0. See <recvsfrom>.

: recvsfrom-all                      \ c-addr1 +n1 addr2 fd -- +n2 ior
Blocks until all n1 bytes are received. Equivalent to <recvsfrom> when n2 = MSG_WAITALL flag. See <recvsfrom>.
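
The following sketch shows one way the UDP words might be combined into a one-shot echo responder. It assumes this library is loaded in VFX Forth for Linux and that the end point filled in by recvsfrom can be passed straight to sendsto. Port 7624 and the names udp-local, udp-sender, dgram and udp-echo-once are illustrative only.

```forth
\ Assumes this socket library is loaded in VFX Forth for Linux.
7624 EndPoint: udp-local any-ip          \ where we listen
EndPoint: udp-sender unknown-ip          \ filled in by recvsfrom

Create dgram 512 allot

: udp-echo-once  ( -- )
  UDP socket throw >r                           \ R: fd
  udp-local r@ binds throw
  dgram 512 udp-sender r@ recvsfrom throw       \ -- #received
  dgram swap udp-sender r@ sendsto throw drop   \ reply to whoever sent it
  r> closes throw ;
```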

Non-blocking connection-oriented socket I/O

The following words are intended for use with a cooperative multitasker using pause and non-blocking sockets, although they also work with blocking sockets without a multitasker. All xxx-mt words are the multitasked counterparts of the xxx words.

NOTE 03Jan07: xxx-mt words have been refactored to avoid throwing exceptions from the inner words. The internal ior codes are propagated to the outermost word instead. This has caused a lot of headaches and has added significant complexity to the xxx-mt words.



Reusable bits

Extern: int fcntl(int fd, int cmd, LONG arg);
See man fcntl.

3 Constant F_GETFL
File control 'get flags' command. See man fcntl.

4 Constant F_SETFL
File control 'set flags' command. See man fcntl.

$800 Constant O_NONBLOCK
Marks a file descriptor as non-blocking. Used to perform multitasked I/O instead of the usual multiplexed I/O with select(). See man open.

: fcntl@                \ fd -- n1 ior
Get the socket file control flags (only handles O_APPEND, O_ASYNC, O_NONBLOCK, O_DIRECT). Returns a throwable ior. n1 is undefined if ior <> 0. See man fcntl.

: fcntl!                \ n1 fd -- ior
Set the socket file control flags (only handles O_APPEND, O_ASYNC, O_NONBLOCK, O_DIRECT). n1 is the new flag value. Returns a throwable ior. See man fcntl.

11  Constant EAGAIN                     \ Try Again
errno code returned for non-blocking I/O in accepts sends recvs sendsto recvsfrom.

114 Constant EALREADY
errno code returned for a non-blocking socket when connects is performed while a former connects request is still in progress.

115 Constant EINPROGRESS                \ Operation now in progress
errno code returned for a non-blocking socket when a connects request would need to wait for completion.

: again?                    \ ior1 -- ior2 flag
Process ior1 to filter the EAGAIN value. In this case, flag is true and ior2 is zero. Any other ior1 will produce the same value in ior2 and a false flag.
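
The filtering described above can be sketched as follows. This is a stand-alone illustration rather than the library source, and it assumes that errno values are turned into throw codes by subtracting them from errno-base.

```forth
\ Sketch only; not meant to be loaded on top of the library.
-5000 Value errno-base
#11 Constant EAGAIN
: errno>excp  ( errno -- n1 )  errno-base swap - ;   \ ASSUMED mapping

: again?  ( ior1 -- ior2 flag )
  dup EAGAIN errno>excp = if
    drop 0 true                 \ "would block": clear the ior, ask the caller to retry
  else
    false                       \ success or a real error: pass the ior through
  then ;
```

A polling loop can then test the flag to decide whether to wait and retry, and test ior2 to decide whether to give up.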

: in-progress?                    \ ior1 -- ior2 flag
Process ior1 to filter the EINPROGRESS or EALREADY values when using connects. In this case, flag is true and ior2 is zero. Any other ior1 will produce the same value in ior2 and a false flag. Please note that connects can return EAGAIN but this denotes a different error situation (see man connect.)



Main words

: -blocking               \ fd -- ior
Configure the socket as non blocking. Returns a throwable ior.

: +blocking               \ fd -- ior
Configure the socket as blocking. Returns a throwable ior.
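
Both words presumably read the current flags with fcntl@, change a single bit and write the result back with fcntl!. The bit manipulation alone can be sketched as below. The names +nonblock and -nonblock are hypothetical and exist only for this illustration.

```forth
\ Sketch only: the flag arithmetic between fcntl@ and fcntl!.
$800 Constant O_NONBLOCK

: +nonblock  ( flags1 -- flags2 )  O_NONBLOCK or ;           \ set the bit, as -blocking needs
: -nonblock  ( flags1 -- flags2 )  O_NONBLOCK invert and ;   \ clear the bit, as +blocking needs
```

Reading the flags first, rather than writing O_NONBLOCK alone, preserves any other status flags already set on the descriptor.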

: peeks?                 \ fd -- flag ior
Peek into the socket system buffer to check for data available. Returns a true flag if one or more bytes are waiting to be read and a throwable ior.

1 Value #ms-delay
Poll period, in milliseconds, of the non-blocking xxx-mt I/O words below. Setting this value to 0 increases the CPU load. The default value is enough to produce a negligible load, at the cost of reduced performance.

: delayed           \ --
By using this word, the caller is delayed #ms-delay ms. and allows task switching if multitasking is enabled.

: connects-mt                   \ addr1 fd1 -- ior
Non-blocking I/O counterpart of connects. This word loops over connects and uses delayed internally.

: accepts-mt                   \ addr1 fd1 -- fd2 ior
Non-blocking I/O counterpart of accepts. This word loops over accepts and uses delayed internally.

: (sends-mt)               \ c-addr1 +n1 fd ior1 -- c-addr2 n2 fd ior2
This word is an internal factor of sends-mt for non-blocking I/O and should not be used. Tries to send n1 bytes to socket given by fd and loops over until the socket says it would not block. Returns the remaining buffer c-addr2 n2 to be sent.

: (recvs-mt)               \ c-addr1 n1 fd ior1 -- c-addr2 +n2 fd ior2
This word is an internal factor of recvs-mt for non-blocking I/O and should not be used. Tries to receive n1 bytes from the socket given by fd and loops over until the socket says it would not block. Returns the remaining buffer c-addr2 n2 to be received.

: #bytes-io                           \ c-addr1 n1 x1 ior n2 -- n3 ior
This word is a factor for some loop epilogs below and should not be used. Returns the number of bytes processed in I/O in n3 from the total requested n2 and the remaining amount n1. Reorders the result to place ior on TOS. x1 is a don't-care cell left at the loop exit that gets dropped along the way.

: sends-mt                            \ c-addr1 n1 fd -- n2 ior
Non-blocking I/O counterpart of sends. This word loops over sends and uses delayed internally.

: recvs-mt                            \ c-addr1 +n1 fd -- +n2 ior
Non-blocking I/O counterpart of recvs. This word loops over recvs and uses delayed internally.

: halt-loop?   \ n2 fd ior -- n2 ior fd flag
This word is an internal factor of recvs-all-mt. Do not use. This word factors the loop end detection while preserving stack ordering. Returns true if ior < 0 or n2 = 0, false otherwise.

: recvs-all-mt                        \ c-addr1 +n1 fd -- +n2 ior
Non-blocking I/O counterpart of recvs-all. This word loops over recvs and uses delayed internally.

Non-blocking connectionless socket I/O

The following words are intended for use with a cooperative multitasker using pause and non-blocking sockets, although they also work with blocking sockets without a multitasker. All xxx-mt words are the multitasked counterparts of the xxx words.



Reusable bits

: lover                           \ x1 x2 x3 -- x1 x2 x3 x1
"long" over, although the name actually suggests something different :-)

: halt2-loop?                \ n2 addr2 fd ior -- n2 addr2 ior fd flag
This word is an internal factor of recvs-all-mt and recvsfrom-all-mt. Do not use. This word factors the loop end detection while preserving stack ordering. Returns true if ior < 0 or n2 = 0, false otherwise.

: (sendsto-mt)  \ c-addr1 +n1 addr2 fd ior -- c-addr3 +n2 addr2 fd ior
This word is an internal factor of sendsto-mt for non-blocking I/O and should not be used. Tries to send n1 bytes to socket fd and loops over and over again until socket says it would not block. Returns the remaining buffer c-addr3 n2 to be sent. addr2 denotes the destination IP /endpoint. ior is propagated from sendsto.

: (recvsfrom-mt)   \ c-addr1 +n1 addr2 fd ior -- c-addr3 +n2 addr2 fd ior
This word is an internal factor of recvsfrom-mt for non-blocking I/O and should not be used. Tries to receive n1 bytes into buffer c-addr1 from socket fd and loops over and over again until the socket says it would not block. Returns the remaining buffer c-addr3 n2 to be received. addr2 denotes the remote IP /endpoint. ior is propagated from recvsfrom.



Main words

: sendsto-mt                         \ c-addr1 +n1 addr2 fd -- +n2 ior
Non-blocking I/O counterpart of sendsto. This word loops over sendsto and uses delayed internally.

: recvsfrom-mt                       \ c-addr1 +n1 addr2 fd -- +n2 ior
Non-blocking I/O counterpart of recvsfrom. This word loops over recvsfrom and uses delayed internally.

: recvsfrom-all-mt                   \ c-addr1 +n1 addr2 fd -- +n2 ior
Non-blocking I/O counterpart of recvsfrom-all. This word loops over recvsfrom-all and uses delayed internally.

Warning: Unpredictable results are obtained if several simultaneous clients send their data to the same connectionless socket in the receiving side. Data will be mixed in the same buffer and a wrong byte count will be reported.

Multiplexed I/O

An alternative to multitasked I/O when blocking sockets are used. The poll() system call is easier to implement than select().



Reusable bits

Internal documentation.

AliasedExtern: ux-poll int poll(void * ufds, unsigned int nfds, int timeout);
See man poll.



Main words

$0001 Constant POLLIN
On pollfd.events: notify when data ready. On pollfd.revents: data ready to read.

$0004 Constant POLLOUT
On pollfd.events: notify when write will not block. On pollfd.revents: write will not block.

$0008 Constant POLLERR
On pollfd.revents: Error condition.

$0010 Constant POLLHUP
On pollfd.revents: Hangup. Remote socket closed.

$0020 Constant POLLNVAL
On pollfd.revents: Invalid request. File descriptor is not open.

struct /pollfd                          \ -- len
Mimics the Linux struct pollfd. See man poll.

   1 cells field pollfd.fd              \ file descriptor
   2 chars field pollfd.events          \ requested events
   2 chars field pollfd.revents         \ returned events
end-struct

: polls                                   \ addr1 n1 n2 -- n3 ior
Poll the system for I/O events. An array of /pollfd structures is passed starting at addr1. n1 is the array length in structure units (not bytes). n2 is a timeout in milliseconds. n2 < 0 indicates an infinite timeout. n3 is the number of structures having pollfd.revents <> 0. If n3 = 0, a timeout has occurred. If n3 < 0, an internal error has occurred and ior contains a throwable errno code. See man poll.
NOTE: Tests show that when the remote side closes the socket, poll returns with a POLLIN event instead of POLLHUP. Performing a recvs on that socket returns a zero length.
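
As a usage illustration, the sketch below waits for a single descriptor to become readable. It assumes this library is loaded in VFX Forth for Linux and that w! is the system's 16-bit store, which matches the 2 chars width of the events fields. The names fds and wait-readable are hypothetical.

```forth
\ Assumes this socket library is loaded in VFX Forth for Linux.
Create fds /pollfd allot               \ a one-element /pollfd array

: wait-readable  ( fd ms -- flag )
  swap fds pollfd.fd !                 \ descriptor to watch
  POLLIN fds pollfd.events w!          \ ask for "data ready" (w! assumed: 16-bit store)
  0 fds pollfd.revents w!              \ clear the returned events
  fds 1 rot polls throw                \ -- n3
  0> ;                                 \ true: an event is pending, false: timeout
```

Because a closed connection is also reported as POLLIN, a true flag only means that a following receive will not block. The receive itself tells whether data or a close arrived.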