SWI-Prolog -- Unifying data

12.4.6 Unifying data

The functions of this section unify terms with other terms or translated C data structures. Except for PL_unify(), these functions are specific to SWI-Prolog. They were introduced because they shorten the code for returning data to Prolog and make it more efficient by avoiding the allocation of temporary term references and reducing the number of calls to the Prolog API. Consider the case where we want a foreign function to return the host name of the machine Prolog is running on. Using the PL_get_*() and PL_put_*() functions, the code becomes:

foreign_t
pl_hostname(term_t name)
{ char buf[100];

  if ( gethostname(buf, sizeof buf) == 0 )   /* 0 means success */
  { term_t tmp = PL_new_term_ref();

    PL_put_atom_chars(tmp, buf);
    return PL_unify(name, tmp);
  }

  PL_fail;
}

Using PL_unify_atom_chars(), this becomes:

foreign_t
pl_hostname(term_t name)
{ char buf[100];

  if ( gethostname(buf, sizeof buf) == 0 )   /* 0 means success */
    return PL_unify_atom_chars(name, buf);

  PL_fail;
}

Note that unification functions that perform multiple bindings may leave some of these bindings in place if they fail. See PL_unify() for details.

bool PL_unify(term_t ?t1, term_t ?t2)

Unify two Prolog terms and return TRUE on success. PL_unify() does not evaluate attributed variables (see section 8.1); it merely schedules the goals associated with the attributes to be executed after the foreign predicate succeeds. (Goals associated with attributes may be non-deterministic, which we cannot handle from a callback. A callback could also result in deeply nested mutual recursion between C and Prolog and eventually trigger a C stack overflow.)

Care is needed if PL_unify() returns FALSE and the foreign function does not immediately return to Prolog with FALSE. Unification may perform multiple changes to either t1 or t2. A failing unification may have created bindings before failure is detected. Already created bindings are not undone. For example, calling PL_unify() on a(X, a) and a(c,b) binds X to c and fails when trying to unify a to b. If control remains in C or if we want to return success to Prolog, we must undo such bindings. In addition, PL_unify() may have failed on an exception, typically a resource (stack) overflow. This can be tested using PL_exception(), passing 0 (zero) for the query-id argument. Foreign functions that encounter an exception must return FALSE to Prolog as soon as possible or call PL_clear_exception() if they wish to ignore the exception. Note that there can only be an exception if PL_unify() returned FALSE.

In some scenarios we need to undo partial unifications. Suppose we have a database that contains Prolog terms and we run a query over this database. We must succeed on the first successful unification. If a unification is not successful, we must stop if there is an exception or undo the partial unification and try again. Suppose our database contains f(a,1) and f(b,2) and our query is f(A,2). This should succeed with A = b, but the first unification binds A to a before failing to unify 1 with 2.

static foreign_t
find_in_db(term_t target)
{ fid_t fid = PL_open_foreign_frame();
  term_t candidate = PL_new_term_ref();

  while(get_from_my_database(candidate))
  { if ( PL_unify(candidate, target) ) /* found */
    { PL_close_foreign_frame(fid);
      return TRUE;
    } else if ( PL_exception(0) )      /* error */
    { PL_close_foreign_frame(fid);
      return FALSE;
    }

    PL_rewind_foreign_frame(fid);      /* try next */
  }
  PL_close_foreign_frame(fid);         /* not found */
  return FALSE;
}

This code is only needed if the foreign predicate does not return immediately to Prolog when PL_unify() fails - there is an implicit frame around the entire predicate, and returning FALSE undoes all bindings when that frame is closed.
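The undo step can be pictured with a small stand-alone model. The sketch below does not call the SWI-Prolog API: store_t, arg_t, unify_arg(), rewind_to() and find_fact() are hypothetical names introduced only for illustration, and terms are reduced to two-argument facts over integer constants. It mirrors the loop above: remember a mark, try a candidate, and undo the bindings made since the mark before trying the next one.

```c
#include <stddef.h>

#define UNBOUND  (-1)
#define MAX_VARS 8

/* Hypothetical miniature binding store: each variable is a slot that
   holds a constant (>= 0) or UNBOUND.  The trail records which slots
   were bound, so that the bindings can be undone later. */
typedef struct
{ int value[MAX_VARS];
  int trail[MAX_VARS];
  int top;                              /* number of trail entries */
} store_t;

/* A query argument is a constant or a variable (index into the store). */
typedef struct
{ int is_var;
  int v;                                /* constant or variable index */
} arg_t;

static void
store_init(store_t *s)
{ for(int i = 0; i < MAX_VARS; i++)
    s->value[i] = UNBOUND;
  s->top = 0;
}

/* Unify one query argument with the constant c, trailing new bindings. */
static int
unify_arg(store_t *s, arg_t q, int c)
{ if ( !q.is_var )
    return q.v == c;
  if ( s->value[q.v] == UNBOUND )
  { s->value[q.v] = c;
    s->trail[s->top++] = q.v;
    return 1;
  }
  return s->value[q.v] == c;
}

/* Undo all bindings made since mark (the role of rewinding a frame). */
static void
rewind_to(store_t *s, int mark)
{ while ( s->top > mark )
    s->value[s->trail[--s->top]] = UNBOUND;
}

/* Return the index of the first fact that unifies with query, or -1. */
static int
find_fact(store_t *s, const arg_t query[2], const int facts[][2], size_t n)
{ int mark = s->top;

  for(size_t i = 0; i < n; i++)
  { if ( unify_arg(s, query[0], facts[i][0]) &&
         unify_arg(s, query[1], facts[i][1]) )
      return (int)i;                    /* found: keep the bindings */
    rewind_to(s, mark);                 /* failed: undo partial bindings */
  }
  return -1;
}
```

With the facts f(a,1) and f(b,2) and the query f(A,2), the first attempt binds A to a and then fails on the second argument; rewind_to() resets A, so the second attempt can bind it to b.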

bool PL_unify_atom(term_t ?t, atom_t a)

Unify t with the atom a and return non-zero on success.

bool PL_unify_bool(term_t ?t, int a)

Unify t with either false or true, according to whether a is zero or non-zero. If t is instantiated, off and on are also accepted.

bool PL_unify_chars(term_t ?t, int flags, size_t len, const char *chars)

New function to deal with unification of char* with various encodings to a Prolog representation. The flags argument is a bitwise or specifying the Prolog target type and the encoding of chars. A Prolog type is one of PL_ATOM, PL_STRING, PL_CODE_LIST or PL_CHAR_LIST. A representation is one of REP_ISO_LATIN_1, REP_UTF8 or REP_MB. See PL_get_chars() for a definition of the representation types. If len is (size_t)-1, chars must be zero-terminated and the length is computed from chars using strlen().

If flags includes PL_DIFF_LIST and type is one of PL_CODE_LIST or PL_CHAR_LIST, the text is converted to a difference list. The tail of the difference list is t+1.

bool PL_unify_atom_chars(term_t ?t, const char *chars)

Unify t with an atom created from chars and return non-zero on success.

bool PL_unify_list_chars(term_t ?t, const char *chars)

Unify t with a list of ASCII characters constructed from chars.

bool PL_unify_string_chars(term_t ?t, const char *chars)

Unify t with a Prolog string object created from the zero-terminated string chars. The data will be copied. See also PL_unify_string_nchars().

bool PL_unify_integer(term_t ?t, intptr_t n)

Unify t with a Prolog integer from n.

bool PL_unify_int64(term_t ?t, int64_t n)

Unify t with a Prolog integer from n.

bool PL_unify_uint64(term_t ?t, uint64_t n)

Unify t with a Prolog integer from n. Note that unbounded integer support is required if n does not fit in a signed int64_t. If unbounded integers are not supported a representation_error is raised.
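The condition "does not fit in a signed int64_t" can be checked in plain C before calling PL_unify_uint64(). The helper below is a minimal sketch; needs_unbounded_integer() is a hypothetical name and not part of the SWI-Prolog API.

```c
#include <stdint.h>

/* Return non-zero if n is too large for a signed 64-bit integer, i.e.
   if representing it as a Prolog integer needs unbounded integers. */
static int
needs_unbounded_integer(uint64_t n)
{ return n > (uint64_t)INT64_MAX;
}
```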

bool PL_unify_float(term_t ?t, double f)

Unify t with a Prolog float from f.

bool PL_unify_pointer(term_t ?t, void *ptr)

Unify t with a Prolog integer describing the pointer. See also PL_put_pointer() and PL_get_pointer().

bool PL_unify_functor(term_t ?t, functor_t f)

If t is a compound term with the given functor, just succeed. If it is unbound, create a term and bind the variable, else fail. Note that this function does not create a term if the argument is already instantiated. If f is a functor with arity 0, t is unified with an atom. See also PL_unify_compound().

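bool PL_unify_compound(term_t ?t, functor_t f)

As PL_unify_functor(), but if f is a functor with arity 0, t is unified with a compound term without arguments instead of an atom.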

bool PL_unify_list(term_t ?l, term_t -h, term_t -t)

Unify l with a list-cell (./2). If successful, write a reference to the head of the list into h and a reference to the tail of the list into t. The reference to the tail may be used for subsequent calls to this function. Suppose we want to return a list of atoms from a char **. We could use the example described by PL_cons_list(), followed by a call to PL_unify(), or we can use the code below. If the predicate argument is unbound, the difference is minimal (the code based on PL_cons_list() is probably slightly faster). If the argument is bound, the code below may fail before reaching the end of the word list, but even if the unification succeeds, this code avoids a duplicate (garbage) list and a deep unification.

Note that PL_unify_list() is not used with env but with tail, which is a copy of env. PL_copy_term_ref() creates a new term_t that holds the same Prolog term; it does not copy the Prolog term itself. The only thing that may be done with an argument to a foreign predicate (such as env) is unification; for anything that might overwrite the term reference, you must use a copy created by PL_copy_term_ref(). The name PL_unify_list() is slightly misleading - it unifies the first argument (l) but overwrites the second (h) and third (t) arguments.

foreign_t
pl_get_environ(term_t env)
{ term_t tail = PL_copy_term_ref(env);
  term_t item = PL_new_term_ref();
  extern char **environ;

  for(char **e = environ; *e; e++)
  { if ( !PL_unify_list(tail, item, tail) ||
         !PL_unify_atom_chars(item, *e) )
      PL_fail;
  }

  return PL_unify_nil(tail);
}

In this example, item is initialized outside the loop. This allocates a single new term reference that serves as a temporary inside the loop. There is no need to allocate a new reference each time around the loop: the item term reference can be reused because each call to PL_unify_list() copies a reference to the new list cell's head into the term reference item.

bool PL_unify_nil(term_t ?l)

Unify l with the atom [].

bool PL_unify_arg(int index, term_t ?t, term_t ?a)

Unifies the index-th argument (1-based) of t with a.

bool PL_unify_term(term_t ?t, ...)

Unify t with a (normally) compound term. The remaining arguments are a sequence of a type identifier followed by the required arguments. This function is an extension to the Quintus and SICStus foreign interface from which the SWI-Prolog foreign interface has been derived, but has proved to be a powerful and comfortable way to create compound terms from C. Due to the vararg packing/unpacking and the required type-switching this interface is slightly slower than using the primitives. Please note that some C compilers have fairly low limits on the number of arguments that may be passed to a function.

Special attention is required when passing numbers. C `promotes' any integral type smaller than int to int. That is, the types char, short and int are all passed as int. In addition, on most 32-bit platforms int and long are the same. Up to version 4.0.5, only PL_INTEGER could be specified, which was taken from the stack as long. Such code fails when passing small integral types on machines where int is smaller than long. It is advised to use PL_SHORT, PL_INT or PL_LONG as appropriate. Similarly, C compilers promote float to double and therefore PL_FLOAT and PL_DOUBLE are synonyms.
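The promotion rules can be demonstrated with a small variadic function in plain C. This is only a sketch of the general mechanism, not the implementation of PL_unify_term(); the TAG_* constants and sum_tagged() are hypothetical names loosely modelled on PL_INT, PL_LONG and PL_DOUBLE.

```c
#include <stdarg.h>

/* Hypothetical type tags; TAG_END terminates the argument sequence. */
enum { TAG_END, TAG_INT, TAG_LONG, TAG_DOUBLE };

/* Sum a sequence of tag/value pairs terminated by TAG_END.  Each value
   must be fetched with the type it has after default promotion. */
static double
sum_tagged(int first, ...)
{ va_list ap;
  double total = 0.0;
  int tag = first;

  va_start(ap, first);
  while ( tag != TAG_END )
  { switch(tag)
    { case TAG_INT:                     /* char and short arrive as int */
        total += va_arg(ap, int);
        break;
      case TAG_LONG:                    /* caller must really pass a long */
        total += (double)va_arg(ap, long);
        break;
      case TAG_DOUBLE:                  /* float arrives as double */
        total += va_arg(ap, double);
        break;
      default:                          /* unknown tag: stop reading */
        va_end(ap);
        return total;
    }
    tag = va_arg(ap, int);
  }
  va_end(ap);
  return total;
}
```

A call such as sum_tagged(TAG_INT, s, TAG_LONG, l, TAG_DOUBLE, f, TAG_END), with s a short, l a long and f a float, reads every value correctly. Passing an int where the tag announces a long is an error on platforms where int is smaller than long, which is the problem described above for PL_INTEGER.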

The type identifiers are:

PL_VARIABLE none: No op. Used in arguments of PL_FUNCTOR.
PL_BOOL int: Unify the argument with true or false.
PL_ATOM atom_t: Unify the argument with an atom, as in PL_unify_atom().
PL_CHARS const char *: Unify the argument with an atom constructed from the C char *, as in PL_unify_atom_chars().
PL_NCHARS size_t, const char *: Unify the argument with an atom constructed from length and char* as in PL_unify_atom_nchars().
PL_UTF8_CHARS const char *: Create an atom from a UTF-8 string.
PL_UTF8_STRING const char *: Create a packed string object from a UTF-8 string.
PL_MBCHARS const char *: Create an atom from a multi-byte string in the current locale.
PL_MBCODES const char *: Create a list of character codes from a multi-byte string in the current locale.
PL_MBSTRING const char *: Create a packed string object from a multi-byte string in the current locale.
PL_NWCHARS size_t, const wchar_t *: Create an atom from a length and a wide character pointer.
PL_NWCODES size_t, const wchar_t *: Create a list of character codes from a length and a wide character pointer.
PL_NWSTRING size_t, const wchar_t *: Create a packed string object from a length and a wide character pointer.
PL_SHORT short: Unify the argument with an integer, as in PL_unify_integer(). As short is promoted to int, PL_SHORT is a synonym for PL_INT.
PL_INTEGER long: Unify the argument with an integer, as in PL_unify_integer().
PL_INT int: Unify the argument with an integer, as in PL_unify_integer().
PL_LONG long: Unify the argument with an integer, as in PL_unify_integer().
PL_INT64 int64_t: Unify the argument with a 64-bit integer, as in PL_unify_int64().
PL_INTPTR intptr_t: Unify the argument with an integer with the same width as a pointer. On most machines this is the same as PL_LONG, but on 64-bit MS-Windows pointers are 64 bits while longs are only 32 bits.
PL_DOUBLE double: Unify the argument with a float, as in PL_unify_float(). Note that, as the argument is passed using the C vararg conventions, a float is promoted to double automatically, but an integer value must be cast to double explicitly.
PL_FLOAT double: Unify the argument with a float, as in PL_unify_float().
PL_POINTER void *: Unify the argument with a pointer, as in PL_unify_pointer().
PL_STRING const char *: Unify the argument with a string object, as in PL_unify_string_chars().
PL_TERM term_t: Unify a subterm. Note this may be the return value of a PL_new_term_ref() call to get access to a variable.
PL_FUNCTOR functor_t, ...: Unify the argument with a compound term. This specification should be followed by exactly as many specifications as the number of arguments of the compound term.
PL_FUNCTOR_CHARS const char *name, int arity, ...: Create a functor from the given name and arity and then behave as PL_FUNCTOR.
PL_LIST int length, ...: Create a list of the indicated length. The remaining arguments contain the elements of the list.

For example, to unify an argument with the term language(dutch), the following skeleton may be used:

static functor_t FUNCTOR_language1;

static void
init_constants()
{ FUNCTOR_language1 = PL_new_functor(PL_new_atom("language"),1);
}

foreign_t
pl_get_lang(term_t r)
{ return PL_unify_term(r,
                       PL_FUNCTOR, FUNCTOR_language1,
                           PL_CHARS, "dutch");
}

install_t
install()
{ PL_register_foreign("get_lang", 1, pl_get_lang, 0);
  init_constants();
}

bool PL_chars_to_term(const char *chars, term_t -t)

Parse the string chars and put the resulting Prolog term into t. chars may or may not be closed using a Prolog full-stop (i.e., a dot followed by a blank). Returns FALSE if a syntax error was encountered and TRUE after successful completion. In addition to returning FALSE, the exception-term is returned in t on a syntax error. See also term_to_atom/2.

The following example builds a goal term from a string and calls it.

int
call_chars(const char *goal)
{ fid_t fid = PL_open_foreign_frame();
  term_t g = PL_new_term_ref();
  int rval;

  if ( PL_chars_to_term(goal, g) )
    rval = PL_call(g, NULL);            /* call the parsed term, not the string */
  else
    rval = FALSE;

  PL_discard_foreign_frame(fid);
  return rval;
}
  ...
  call_chars("consult(load)");
  ...

PL_chars_to_term() is defined using PL_put_term_from_chars(), which can also deal with strings that are not null-terminated and with strings in different encodings:

int
PL_chars_to_term(const char *s, term_t t)
{ return PL_put_term_from_chars(t, REP_ISO_LATIN_1, (size_t)-1, s);
}

bool PL_wchars_to_term(const pl_wchar_t *chars, term_t -t)

Wide character version of PL_chars_to_term().

char * PL_quote(int chr, const char *string)

Return a quoted version of string. If chr is '\'', the result is a quoted atom. If chr is '"', the result is a string. The result string is stored in the same ring of buffers as described with the BUF_STACK argument of PL_get_chars().

In the current implementation, the string is surrounded by chr and any occurrence of chr is doubled. In the future the behaviour will depend on the character_escapes Prolog flag.
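The quoting rule described here (surround with chr and double every embedded chr) can be sketched in plain C as follows. This is not the code of PL_quote(); quote_doubling() is a hypothetical helper that writes into a caller-supplied buffer instead of the buffer ring.

```c
#include <stddef.h>
#include <string.h>

/* Write `in` to `out`, surrounded by chr and with every embedded chr
   doubled.  Returns the length of the result (excluding the NUL), or
   -1 if `out` is too small. */
static int
quote_doubling(int chr, const char *in, char *out, size_t outsize)
{ size_t need = 2 + strlen(in) + 1;     /* two quotes, text and NUL */
  size_t o = 0;

  for(const char *s = in; *s; s++)
  { if ( *s == chr )
      need++;                           /* one extra byte per doubled chr */
  }
  if ( need > outsize )
    return -1;

  out[o++] = (char)chr;
  for(const char *s = in; *s; s++)
  { if ( *s == chr )
      out[o++] = (char)chr;
    out[o++] = *s;
  }
  out[o++] = (char)chr;
  out[o] = '\0';

  return (int)o;
}
```

For example, quoting it's with chr set to '\'' produces 'it''s'.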

int PL_for_dict(term_t dict, int (*func)(term_t key, term_t value, void *closure), void *closure, int flags)

Iterates over dict, calling func for each item. In each call, key and value are the processed item's key-value pair and the closure argument is passed from the call to PL_for_dict(). If func returns a non-0 value, the iteration stops and PL_for_dict() returns that value; otherwise, all pairs are processed and PL_for_dict() returns 0. If flags contains PL_FOR_DICT_SORTED, the key-value pairs are processed in the standard order of terms; otherwise the processing order is unspecified.
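The callback protocol (stop at the first non-zero return value and hand that value back to the caller) can be illustrated without Prolog. The sketch below models it over a plain C array; pair_t, for_each_pair(), sum_values() and find_key() are hypothetical names and the code does not call PL_for_dict().

```c
#include <stddef.h>
#include <string.h>

typedef struct
{ const char *key;
  int value;
} pair_t;

typedef int (*pair_func)(const char *key, int value, void *closure);

/* Call func for each pair.  Stop at the first non-zero result and
   return it; return 0 if all pairs were processed. */
static int
for_each_pair(const pair_t *pairs, size_t n, pair_func func, void *closure)
{ for(size_t i = 0; i < n; i++)
  { int rc = func(pairs[i].key, pairs[i].value, closure);

    if ( rc != 0 )
      return rc;
  }
  return 0;
}

/* Callback that never stops: add each value to the int in closure. */
static int
sum_values(const char *key, int value, void *closure)
{ (void)key;
  *(int *)closure += value;
  return 0;
}

/* Callback that stops at the key named by closure, returning its value
   (assumed to be non-zero in this sketch). */
static int
find_key(const char *key, int value, void *closure)
{ return strcmp(key, (const char *)closure) == 0 ? value : 0;
}
```

The closure pointer carries per-call state (an accumulator in sum_values(), the key to look for in find_key()), which is the same role the closure argument plays for PL_for_dict().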