This document describes the NSM native code interface for each platform that NSM runs on.
The native code interface lets you call platform-specific native code from NewtonScript. As NSM is only a virtual machine and can do very limited input and output by itself, this provides a "bridge to the outside world" and lets NewtonScript code take advantage of the platform it is running on. It is built into NSM and does not require any additional packages.
Each platform has its own interface. ("Platform" here currently means "operating system". The OS X version of NSM is available for PowerPC, x86, and x86-64, yet they all share the same interface.) However, the interfaces are similar in concept, which is why they are all described in the same document.
Note: As NSM is still in the alpha stage, the native code interfaces may change in the future. Backwards compatibility is not guaranteed.
Before native functions can be called, the environment may need to be set up, for example in the package's InstallScript. When native functions are no longer needed, the environment should be cleaned up to free up memory, for example in the package's RemoveScript. This is different for each platform.
On DOS, no initialization or cleanup needs to be done.
On Win32, OS/2, OS X, and Haiku, native code resides in library files. A library must be loaded before its functions can be called, and should be unloaded when it is no longer needed, to free up memory.
Each of these platforms has a global function to load and a global function to unload a library. The loading function takes the filename of the library to load and returns a handle to it, or nil if the library could not be loaded. The handle must later be passed to a native function when calling it (see Calling native functions later in this document). The unloading function takes a handle returned by the loading function and unloads the library. If an invalid handle is passed to the unloading function, the effect is undefined. It is therefore safest to load the library in the InstallScript, store the handle in a global variable (suffixed by a package and developer signature to ensure uniqueness), and unload it in the RemoveScript.
The loading and unloading functions are named after the platform and the platform's API function that they call. For details on their behavior, such as the order in which libraries are searched, refer to the platform vendor's documentation.
Note: Although the OS X native code interface is the same for PowerPC, x86, and x86-64, you cannot use a library if it has not been compiled for your architecture. However, if you use libraries that you know are available on all architectures (such as OS X's libc.dylib and other libraries), your NewtonScript code will run on all OS X architectures with no extra work. The situation is similar for Haiku, for which GCC 2 and GCC 4 versions of NSM are available, although only x86 is supported.
Native functions are frames of a special format. The order of slots in the frame is significant.
On DOS, the native function frame contains the actual real mode machine code to be executed. The code must be position-independent, which means that only relative jumps and calls may be used. It is therefore recommended to write it manually in assembly language. (Position-dependent code will be supported in a future release.) The calling convention is described in Calling conventions later in this document.
class is the symbol '|8086Function|.
instructions is a binary object containing the machine code.
|in| is an array of symbols specifying the datatype of each of the function's parameters. The number of parameters is determined by the length of the array. See Input/output datatypes later in this document.
out is a symbol specifying the datatype of the function's return value.
A native function can be called just like a NewtonScript function - by using "call with", a message send, or as a global function.
On Win32, OS/2, OS X, and Haiku, the native function frame describes a native function in a library. Since it is separate from the actual library, the same function definition could be used for multiple libraries. This can be useful for implementing extension functionality.
class is 'Win32Function on Win32, '|OS/2Function| on OS/2, 'OSXFunction on OS X, and 'HaikuFunction on Haiku.
name is a symbol specifying the name of the function in the library. On Win32, make sure to include the "A" or "W" suffix for functions that work with strings, for example 'CreateFileA. On OS/2, name can also be an integer specifying the ordinal number of the function, as some libraries, for example DOSCALLS, can only be called by ordinal.
|in| is an array of symbols specifying the datatype of each of the function's parameters. The number of parameters is determined by the length of the array. See Input/output datatypes later in this document.
out is a symbol specifying the datatype of the function's return value.
A native function can be called just like a NewtonScript function - by using "call with", a message send, or as a global function. However, the first parameter, before the function's actual parameters (as specified by the |in| slot), must be the handle returned by the library loading function. If it is not, or it is a handle that has been already used in a call to the library unloading function, the effect is undefined.
When calling a native function, NSM automatically converts parameters and the return value between NewtonScript datatypes and the datatypes used by the native function. Each native datatype has a symbol identifying it. The following list describes all currently supported datatypes. The equivalent datatype in the C programming language is also listed.
'binary: On input, converts a binary object to the address of its contents (in C, this would be a void*), or, if the input is nil, NULL. Not available for output. On DOS, the address is a normalized far pointer; the segment is as high as possible and the offset is in the range 0 to 15. In the case of nil, the segment and offset are both 0.
'char: Converts between a NewtonScript character and an unsigned 8-bit integer (char). On input, only the low 8 bits of the character code are used, the rest are ignored. On output, the value is zero-extended.
'handle: A handle or pointer (void*). Can only be created as the output of a native function, but may be used as input to another. You can also use nil as input to represent NULL. On DOS, this is a far pointer.
'int8: A signed 8-bit integer (int8_t). On input, only the low 8 bits are used. The output is sign-extended.
'int16: A signed 16-bit integer (int16_t). On input, only the low 16 bits are used. The output is sign-extended.
'int32: A signed 32-bit integer (int32_t). The input is sign-extended from a 30-bit NewtonScript integer. The output has the upper two bits stripped.
'uint8: An unsigned 8-bit integer (uint8_t). On input, only the low 8 bits are used. The output is zero-extended.
'uint16: An unsigned 16-bit integer (uint16_t). On input, only the low 16 bits are used. The output is zero-extended.
'uint32: An unsigned 32-bit integer (uint32_t). The input is zero-extended from a 30-bit NewtonScript integer. The output has the upper two bits stripped.
'void: Can only be used as output. Use this if the native function does not return anything (void). The value returned to the NewtonScript code calling the function is undefined.
This section describes how NSM transfers values between NewtonScript code and the native function. Unless you are writing a native function in assembly language, you probably do not have to read this.
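As an illustration of the conversions listed for the datatypes above, the following C sketch models a few of them. It is not NSM's actual source: all function and type names are hypothetical, a 30-bit NewtonScript integer is assumed to be held as a raw bit pattern in the low 30 bits of a uint32_t, and the far pointer is assumed to be derived from a 20-bit real mode linear address.

```c
#include <stdint.h>

/* Assumption: a NewtonScript integer occupies the low 30 bits. */
#define NS_INT_MASK 0x3FFFFFFFu

/* 'int32 input: sign-extend the 30-bit value (bit 29 is the sign bit). */
uint32_t ns_to_int32(uint32_t ns30)
{
    ns30 &= NS_INT_MASK;
    return (ns30 & 0x20000000u) ? (ns30 | 0xC0000000u) : ns30;
}

/* 'uint32 input: zero-extend the 30-bit value. */
uint32_t ns_to_uint32(uint32_t ns30)
{
    return ns30 & NS_INT_MASK;
}

/* 'int32 and 'uint32 output: strip the upper two bits. */
uint32_t native32_to_ns(uint32_t value)
{
    return value & NS_INT_MASK;
}

/* 'int8 output: keep the low 8 bits and sign-extend them. */
int32_t int8_to_ns(uint32_t raw)
{
    int32_t v = (int32_t)(raw & 0xFFu);
    return (v & 0x80) ? v - 0x100 : v;
}

/* 'binary on DOS: a normalized far pointer has the highest possible
   segment, which leaves an offset in the range 0 to 15. */
struct far_ptr {
    uint16_t segment;
    uint16_t offset;
};

struct far_ptr normalize_far(uint32_t linear)
{
    struct far_ptr p;
    p.segment = (uint16_t)(linear >> 4);
    p.offset = (uint16_t)(linear & 0xFu);
    return p;
}
```

Note that stripping the upper two bits means a native result outside the 30-bit range does not survive the round trip; for example, 0xFFFFFFFF comes back as the 30-bit pattern 0x3FFFFFFF.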
On DOS, the "Pascal" calling convention is used. Parameters are pushed on the stack from left to right. 8-bit parameters are pushed as words, but only the low 8 bits are valid. 32-bit parameters are pushed as two words; first the low 16 bits, then the high 16 bits. For far pointers, the segment is pushed first.
The result is returned in AL, AX, or DX:AX, depending on the size.
The function must preserve SS and SP and return with a RETF instruction that cleans up the stack (if it received any parameters).
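The push order of the 16-bit Pascal convention described above can be modeled with a short C sketch. It only records the sequence of words pushed; it does not touch a real stack, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WORDS 16

/* words[0] is the word pushed first, words[count - 1] the one pushed last. */
struct push_trace {
    uint16_t words[MAX_WORDS];
    size_t count;
};

void push_word(struct push_trace *t, uint16_t w)
{
    assert(t->count < MAX_WORDS);
    t->words[t->count++] = w;
}

/* 8-bit parameter: pushed as a full word. Only the low 8 bits are valid;
   this model simply leaves the high byte zero. */
void push_param8(struct push_trace *t, uint8_t v)
{
    push_word(t, v);
}

/* 32-bit parameter: low 16 bits first, then high 16 bits. */
void push_param32(struct push_trace *t, uint32_t v)
{
    push_word(t, (uint16_t)(v & 0xFFFFu));
    push_word(t, (uint16_t)(v >> 16));
}

/* Far pointer: segment first, then offset. */
void push_far_ptr(struct push_trace *t, uint16_t segment, uint16_t offset)
{
    push_word(t, segment);
    push_word(t, offset);
}
```

For a function whose parameters are an 8-bit value, a 32-bit value, and a far pointer, calling push_param8, push_param32, and push_far_ptr in that order records five words, which is also the number of words the function's RETF has to remove from the stack.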
On Win32, OS/2, OS X, and Haiku, the platform's usual API calling convention is followed.
On x86 and x86-64, the stack is aligned to 16 bytes before the function is called.