"SfR Fresh" - the SfR Freeware/Shareware Archive

Member "q-7.11/doc/qdoc.texi" of archive q-7.11.tar.gz:


Caution: As a special service "SfR Fresh" has tried to format the requested Texinfo source page into HTML format but that may be not always succeeeded perfectly. Alternatively you can here view or download the uninterpreted Texinfo source code. That can be also achieved for any archive member file by clicking within an archive contents listing on the first character of the file(path) respectively on the according byte size field.

[Top] [Contents] [Index] [ ? ]

The Q Programming Language


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

1. Introduction

Q stands for “equational”, so Q, in a nutshell, is a programming language which lets you “program by equations”. You specify a system of equations which the interpreter uses as “rewrite rules” to reduce expressions to “normal form”. This allows you to formulate your programs in a high-level, concise and declarative style. Q's development started out in the 1990s, motivated by the author's research on term pattern matching and inspired by the pioneering work on equational programming by Michael O'Donnell and others [O'Donnell 1985], as an attempt to show that term rewriting would provide a feasible basis for a practical programming language. I think that this goal has been achieved, and that the present interpreter is efficient and robust enough for practical purposes. Q has been ported to a variety of operating systems, such as BeOS, FreeBSD, Linux, MacOS X, Solaris and Windows. Porting to other modern UNIX/POSIX environments should be a piece of cake. Thus Q can be used on most modern computer systems, and Q scripts written on one system will usually run on all supported platforms.

The Q language supports a rich variety of built-in types, like arbitrary precision integers, floating point numbers (double precision 64 bit), truth values, strings, tuples, lists, streams (a “lazy” list data structure with call-by-need evaluation), lambda functions and files. It also provides primitives for exception handling and multithreaded execution. Q scripts can be broken down into “modules”, each with their separate namespace, and Q gives you full control over which symbols (function, variable and type names) are exported by each module. This makes it easy to organize large scripts in a modular fashion. Q also allows you to interface to “external” modules written in the C programming language, which provides a means to access functions in C and C++ libraries and employ C's higher processing speed for time-critical tasks. Conversely, Q scripts can also be executed from C and C++, which allows Q to be used as an embedded language or term rewriting engine in C/C++ applications.

As a practical programming language, Q comes with “batteries included”. The standard library, which is mostly written in Q itself, provides rational and complex numbers, a lot of useful tuple, list and stream processing functions (including comprehensions), some common container data structures (dictionaries, sets, etc.), and an interface to the PostScript language. It also includes an extensive system interface which offers services such as binary and C-style formatted I/O, BSD socket I/O, process management, POSIX threads, regular expression matching and internationalization features. Additional extension modules provide interfaces to a number of other third party libraries, which turns Q into a practical tool for a variety of application areas.

In difference to other functional languages, Q is entirely based on the notions of rewrite rules, reductions and irreducible expressions (also known as normal forms) pertaining to the term rewriting calculus. A Q “program”, called a script, consists of equations which are treated as rewrite rules and are used to reduce expressions to normal form. The normal form of an expression denotes its value, which can itself be a compound expression. Q has no rigid distinction between “constructor” and “defined” function symbols and it also allows you to evaluate expressions containing “free” variables. Basically, both sides of an equation may involve arbitrary expressions. Therefore Q can also be used as a tool for symbolic expression evaluation.

On the surface, Q looks very much like contemporary functional languages such as Miranda or Haskell. In fact, the syntax of the language has largely been inspired by the first edition of Introduction to Functional Programming by Richard Bird and Philip Wadler. However, Q is an interpreted language with dynamic typing and eager (leftmost-innermost) evaluation, which is more in line with classical functional languages such as Lisp. For the sake of efficiency, Q scripts are first translated into “bytecode” (an intermediate binary format) which is executed on a virtual stack machine. The interpreter automatically optimizes tail recursion, such that “iterative” algorithms can be realized in constant stack space. Besides the built-in I/O operations, the language is free of side-effects; in particular, Q itself does not have mutable variables (although the standard library provides such imperative features for those who need them). Q also provides two novel and (IMHO) interesting features: a notion of special forms which allows to handle both macro-like functions and lazy evaluation in a uniform setting without having to give up the basic eager evaluation strategy; and a notion of type guards which provides a means to cope with hierarchies of abstract data types (similar to the notion of classes with single inheritance in object-oriented languages) in the context of a term rewriting language.

Using Q is supposed to be fairly simple: you throw together some equations, start the interpreter and then type in the expressions you wish to evaluate. All this can be done with a few keystrokes, if you use the Emacs Q mode supplied with the package. A graphical user interface for Windows is also available. Of course, you can also run the Q programming utilities from the command line if you prefer that.

The manual is organized as follows. In Getting Started, we start out with a brief and informal introduction to the main features of the language, and a glimpse of how Q scripts look like. Lexical Matters, describes the lexical part of the Q language. In Scripts and Modules, we discuss how declarations, definitions and equations are put together to form a script, and how to break down larger scripts into a collection of smaller modules which can be managed separately. Declarations, discusses how certain entities like types, variables and function symbols can be declared in a Q script. Expressions, treats the syntax of Q expressions, and describes the built-in operators provided by the Q language. Equations and Expression Evaluation, is about equations and variable definitions, and how they are used in the evaluation process. Types, and Special Forms, describe the facilities provided by the Q language for dealing with abstract data types and deferred evaluation. Built-In Functions, and The Standard Library, discuss the built-in and standard library functions of the Q language. Clib, describes Q's “system module” which provides access to some important functions from the C library. The appendix gives additional information about some aspects of the language and its implementation. Q Language Grammar, contains a summary of the Q language syntax in BNF. Using Q, provides a description of the programming tools included in the Q programming system, C Language Interface, describes Q's interface to the C programming language, and Debugging, is a brief introduction to the symbolic debugger built into the Q interpreter. Finally, Running Scripts in Emacs, discusses how Q scripts can be edited and run using the Emacs editor.

Acknowledgements

This project has become much further reaching and took a lot longer than I anticipated when I started out working on it, and I wouldn't have been able to keep it going without the patience and support of my beloved wife Evelyn and children Sebastian, Janosch, Yannic and Miriam.

Next I have to acknowledge the support and help provided by colleagues at Mainz and Bremen (Germany) and at INRIA Lorraine (France) during the initial phases of this project. In 1992 Dieter Hofbauer invited me to the term rewriting group at INRIA Lorraine to discuss the design of the (then still nascent) Q with his colleagues and friends. During the following years, I also collaborated with Frank Drewes at Bremen (now at Umea, Sweden) who was involved in the initiation of this project, and Klaus Barthelmann at Mainz who provided many comments which helped to improve the early versions of this manual. Frank and Klaus tested various early beta versions of the Q interpreter, and contributed sample scripts as well as substantial parts of the standard Q library.

I'd also like to express my gratitude to the growing community of users of the Q language all over the world, in particular the members of the Q mailing list, for the lively discussions which helped to improve the language and its implementation, and for the contributed Q modules and applications. Special thanks are due to John Cowan for proofreading the manual, for his valuable suggestions concerning design and implementation issues and for his Q-Chicken interface, and to Rob Hubbard for his comprehensive rational number and polynomial modules which are now part of the distribution.

Last but not least, thanks are also due to the developers of ML and Haskell. While these people have not been involved with Q in any way, their work, which has changed the landscape of functional programming, has been a constant source of inspiration for me.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2. Getting Started

A new programming language is best conveyed through some examples, and therefore it is common practice to start a language description with an introductory chapter which treats the most essential language features in a fairly informal manner. This is also the purpose of the present chapter. We first show how some of the “standard features” of the Q programming language can be employed for using the Q interpreter effectively as a sophisticated kind of “desktop calculator”. The remainder of this section then adresses the question of how you can extend the Q environment by providing your own definitions in Q scripts, and describes the evaluation process and the treatment of runtime errors in the interpreter.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.1 Using the Interpreter

To begin with, let us show how the Q interpreter is invoked to evaluate some expressions with the built-in functions provided by the Q language. Having installed the Q programming system, you can simply invoke the interpreter from your shell by typing q. The interpreter will then start up, display its sign-on, and leaves you at its prompt:

 
==>

This indicates that the interpreter is waiting for you to type an expression. Let's start with a simple one:

 
==> 23
23

Sure enough, that's correct. Whenever you type an expression, the interpreter just prints its value. Any integer is a legal value in the Q language, and the common arithmetic operations work on these values as expected. Q integers are “bignums”, i.e., their size is only limited by available memory:

 
==> 16753418726345 * 991726534256718265234
16614809890429729930396098173389730

“Real” numbers (more exactly, their approximation using 64 bit double precision floating point numbers) are provided as well. Let's try these:

 
==> sqrt (16.3805*5)/.05
181.0

The sqrt identifier denotes a built-in function which computes the square root of its argument. (A summary of the built-in functions can be found in Built-In Functions.)

What happens if you mistype an expression?

 
==> sqrt (16.3805*5)/,05
! Syntax error
>>> sqrt (16.3805*5)/,05
                     ^

As you can see, the interpreter not only lets you know that you typed something wrong, but also indicates the position of the error. It is quite easy to go back now and correct the error, using the interpreter's “command history”. With the up and down arrow keys you can cycle through the expressions you have typed before, edit them, and resubmit a line by hitting the carriage return key. Also note that what you typed is stored in a “history file” when you exit the interpreter, which is reloaded next time the interpreter is invoked. A number of other useful keyboard commands are provided, see Readline: (readline)Command Line Editing section `Command Line Editing' in The GNU Readline Library. In particular, you can have the command line editor “complete” function symbols with the <TAB> key. E.g., if you type in sq and press the <TAB> key, you will get the sqrt function.

Before we proceed, a few remarks about the syntax of function applications are in order. The first thing you should note is that in Q, like in most other modern functional languages, function application is simply denoted by juxtaposition:

 
==> sqrt 2
1.4142135623731

Multiple arguments are specified in the same fashion:

 
==> max 5 7
7

Parentheses can be used for grouping expressions as usual. In particular, if a function application appears as an argument in another function application then it must be parenthesized as follows:

 
==> sqrt (sqrt 2)
1.18920711500272

Another important point is that operators are in fact just functions in disguise. You can turn any operator into a prefix function by enclosing it in parentheses. Thus (+) denotes the function which adds its two arguments, and X+1 can also be written as (+) X 1; in fact, the former expression is nothing but “syntactic sugar” for the latter, see Built-In Operators. You can easily verify this in the interpreter:

 
==> (+) X 1
X+1

You can also have partial function applications like (*) 2 which denotes a function which doubles its argument. Moreover, Q supports so-called operator sections which allow you to specify a binary operator with only either its left or right operand. For instance, (1/) denotes the reciprocal and (+1) the “increment by 1” function:

 
==> (+1) 5
6

==> (1/) 3
0.333333333333333

The interpreter maintains a global variable environment, in which you can store arbitrary expression values. This provides a convenient means to define abbreviations for frequently-used expressions and for storing intermediate results. Variable definitions are done using the def command. For instance:

 
==> def X = 16.3805*5

==> X
81.9025

As indicated, variable symbols usually start with a capital letter; an initial lowercase letter indicates a function symbol. This convention is valid throughout the Q language. However, it is possible to explicitly declare an uncapitalized symbol as a variable, using the var command, and then assign values to it, like so:

 
==> var f

==> def f = sqrt

You can also both declare a variable and initialize its value with a single var command:

 
==> var f = sqrt

==> f X/.05
181.0

Multiple variable declarations and definitions can be entered on a single line, using commas to separate different definitions in a def or var command, and you can separate multiple expressions and commands on the same line with semicolons:

 
==> def X = 16.3805*5, f = sqrt; X; f X/.05
81.9025
181.0

You can also bind several variables at once by using an expression pattern as the left-hand side of a variable definition:

 
==> def (f, X) = (sqrt, 16.3805*5); f X/.05
181.0

(Such pattern-matching definitions only work with def; the left-hand side of a var declaration must always be a simple identifier.)

Another useful feature is the built-in “anonymous” variable ‘_’, which is always set to the value of the most recent expression value printed by the interpreter:

 
==> _
181.0

==> 2*_
362.0

Sometimes you would also like the interpreter to “forget” about a definition. This can be done by means of an undef statement:

 
==> undef X; X
X

Besides def, undef and var, the interpreter provides a number of other special commands. The most important command for beginners certainly is the help command, which displays the online manual using the GNU info reader. You can also run this command with a keyword to be searched in the info file. For instance, to find out about all special commands provided by the interpreter, type the following:

 
==> help commands

(Type q when you are done reading the info file.)

Other useful commands are who which prints a list of the user-defined variables, and whos which describes the attributes of a symbol:

 
==> who
f

==> whos f sqrt

f               user-defined variable symbol
                = sqrt

sqrt            builtin function symbol
                sqrt X1

You can save user-defined variables and reload them using the save and load commands:

 
==> save
saving .q_vars

==> def f = 0; f
0

==> load
loading .q_vars

==> f
sqrt

Some statistics about the most recent expression evaluation can be printed with the stats command:

 
==> sum [1..123456]
7620753696

==> stats
0.52 secs, 246916 reductions, 246918 cells

The stats command displays the (cpu) time needed to complete an evaluation, the number of “reductions” (a.k.a. basic evaluation steps) performed by the interpreter, and the number of expression “cells” needed during the course of the computation. This information is often useful for profiling purposes.

Other commands allow you to edit and run a script directly from the interpreter, and to inspect and set various internal parameters of the interpreter; see Command Language.

Of course, the Q interpreter can also carry out computations on non-numeric data. In fact, Q provides a fairly rich collection of built-in data types, and makes it easy to define your own. For instance, strings are denoted by character sequences enclosed in double quotes. Strings can be concatenated with the ‘++’ operator and the length of a string can be determined with ‘#’. Moreover, the character at a given position can be retrieved with the (zero-based) indexing operator ‘!’:

 
==> "abc"++"xyz"; #"abc"; "abc"!1
"abcxyz"
3
"b"

The same operations also apply to lists, which are written as sequences of values enclosed in brackets:

 
==> [a,b,c]++[x,y,z]; #[a,b,c]; [a,b,c]!1
[a,b,c,x,y,z]
3
b

As in Prolog, lists are actually represented as right-recursive constructs of the form ‘[X|Xs]’ where X denotes the head element of the list and Xs its tail, i.e., the list of the remaining elements. This representation allows a list to be traversed and manipulated in an efficient manner.

 
==> def [X|Xs] = [a,b,c]; X; Xs
a
[b,c]

==> [0|Xs]
[0,b,c]

Another useful list operation is “enumeration”, which works with all so-called “enumeration types” (cf. Enumeration Types), as well as characters and numbers. Enumerations are constructed with the enum function, or using the familiar [X..Y] notation which is in fact only “syntactic sugar” for enum:

 
==> enum "a" "k"
["a","b","c","d","e","f","g","h","i","j","k"]

==> ["a".."k"]
["a","b","c","d","e","f","g","h","i","j","k"]

==> [0..9]
[0,1,2,3,4,5,6,7,8,9]

==> [0.1,0.2..1.0]
[0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0]

Tuples are sequences of expressions enclosed in parentheses, which are Q's equivalent of “vectors” or “arrays” in other languages. Like lists, tuples can be concatenated, indexed and enumerated, but they use an internal representation optimized for efficient storage and constant-time indexing:

 
==> (a,b,c)++(x,y,z); #(a,b,c); (a,b,c)!1
(a,b,c,x,y,z)
3
b

==> (0..9)
(0,1,2,3,4,5,6,7,8,9)

Q also offers built-in operations to convert between lists and tuples:

 
==> tuple [a,b,c]
(a,b,c)

==> list _
[a,b,c]

Q provides yet another list-like data structure, namely “lazy” lists a.k.a. streams, which are written using curly braces instead of brackets. In difference to lists, streams only produce their elements “on demand”, when they are needed in the course of a computation. This makes it possible to represent large and even infinite sequences of values in an efficient manner. You can see this somewhat unusual behaviour if you try streams interactively in the interpreter. For instance, let's try to concatenate two streams:

 
==> {a,b,c}++{x,y,z}
{a|{b,c}++{x,y,z}}

You probably noticed that the tail of the resulting stream, {b,c}++{x,y,z}, has not been evaluated yet. But that is all right, because it will be evaluated as soon as we need it:

 
==> _!4
y

This lazy way of doing things becomes especially important when we work with infinite streams. We will have more to say about this in the following section.

It is quite instructive to keep on playing a bit like this, to get a feeling of what the Q interpreter can do. You might also wish to consult Built-In Functions, which discusses the built-in functions, and Using Q, which shows how the interpreter is invoked and what additional capabilities it offers. To exit the interpreter when you are finished, invoke the built-in quit function:

 
==> quit

This function does not return a value, but immediately returns you to the operating system's command shell.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.2 Using the Standard Library

The standard library is a collection of useful functions and data structures which are not provided as built-ins, but are implemented by scripts which are mostly written in Q itself. Most of these scripts are included in the “prelude script” prelude.q, which is always loaded by the interpreter, so that the standard library functions are always available. See The Standard Library, for an overview of these operations. You can check which standard library scripts a.k.a. “modules” are actually loaded with the interpreter's modules command. With the standard prelude loaded, this command shows the following:

 
==> modules 
assert          clib*           complex         cond
error           math            prelude         rational
sort            stdlib          stream          string
tuple           typec

As you can see, there are quite a few modules already loaded, and you can use the functions provided by these modules just like any of the built-in operations. The standard library provides a lot of additional functions operating on numbers, strings and lists. For instance, you can take the sum of a list of (integer or floating point) numbers simply as follows:

 
==> sum [1..5]
15

In fact, the library provides a rather general operation, foldl, which iterates a binary function over a list, starting from a given initial value. Using the foldl function, the above example can also be written as follows:

 
==> foldl (+) 0 [1..5]
15

(There also is a foldr function which works analogously, but combines list members from right to left rather than from left to right.)

Generalizing the notion of simple enumerations of numbers like [1..5], the standard library also provides the general-purpose list-generating function while. For instance, we can generate the list of all powers of 2 in the range [1..1000] as follows:

 
==> while (<=1000) (2*) 1
[1,2,4,8,16,32,64,128,256,512]

(Recall from the previous section that (2*) is the doubling function. And (<=1000) is a predicate which checks its argument to be less than or equal to 1000.)

If we want to generate a list of a given size, we can use iter instead. So, for instance, we might compute the value of a finite geometric series as follows:

 
==> sum (iter 4 (/3) 1)
1.48148148148148

The map function allows you to apply a function to every member of a list. For instance, the following expression doubles each value in the list [1..5]:

 
==> map (2*) [1..5]
[2,4,6,8,10]

Lists can also be filtered with a given predicate:

 
==> filter (>=3) [1..5]
[3,4,5]

The scanl operation allows you to compute all the sums of initial segments of a list (or accumulate any other binary operation over a list):

 
==> scanl (+) 0 [1..5]
[0,1,3,6,10,15]

(Like foldl, scanl also has a sibling called scanr which collects results from right to left rather than left to right, starting at the end of the list.)

You can partition a list into an initial part with a given length and the remaining part of the list with take and drop:

 
==> take 3 [1..5]; drop 3 [1..5]
[1,2,3]
[4,5]

The takewhile and dropwhile functions have a similar effect, but partition their input list according to a given predicate:

 
==> takewhile (<=3) [1..5]; dropwhile (<=3) [1..5]
[1,2,3]
[4,5]

Another useful list operation is zip which collects pairs of corresponding elements in two input lists:

 
==> zip [1..5] ["a".."e"]
[(1,"a"),(2,"b"),(3,"c"),(4,"d"),(5,"e")]

The effect of zip can be undone with unzip which returns a pair of lists:

 
==> unzip _
([1,2,3,4,5],["a","b","c","d","e"])

The zipwith function is a generalized version of zip which combines corresponding members from two lists using a given binary function:

 
==> zipwith (*) [1..5] [1..5]
[1,4,9,16,25]

All the operations described above – and many others – are provided by the stdlib.q script. It is instructive to take a look at this script and see how the operations are defined.

In addition, the standard library provides yet another list generating function listof which implements so-called “list comprehensions”. These allow you to describe list values in much the same way as sets are commonly specified in mathematics. For instance, you can generate a list of pairs (I,J) with 1<=J<I<=5 as follows:

 
==> listof (I,J) (I in [1..5], J in [1..I-1])
[(2,1),(3,1),(3,2),(4,1),(4,2),(4,3),(5,1),(5,2),(5,3),(5,4)]

The language provides syntactic sugar for list comprehensions so that the above example can also be written as follows:

 
==> [(I,J) : I in [1..5], J in [1..I-1]]
[(2,1),(3,1),(3,2),(4,1),(4,2),(4,3),(5,1),(5,2),(5,3),(5,4)]

The same kind of construct also works with tuples:

 
==> ((I,J) : I in [1..5], J in [1..I-1])
((2,1),(3,1),(3,2),(4,1),(4,2),(4,3),(5,1),(5,2),(5,3),(5,4))

We also have a random number generator, which is implemented by the built-in random function. Here is how we can generate a list of 5 pseudo random 32 bit integers:

 
==> [random : I in [1..5]]
[1960913167,1769592841,3443410988,2545648850,536988551]

To get random floating point values in the range [0,1] instead, we simply divide the results of random by 0xffffffff:

 
==> map (/0xffffffff) _
[0.456560674928259,0.41201544027124,0.801731596887515,0.592705060400233,
0.125027389993199]

Lists can be sorted using quicksort (this one comes from sort.q):

 
==> qsort (<) _
[0.125027389993199,0.41201544027124,0.456560674928259,0.592705060400233,
0.801731596887515]

As already mentioned, the Q language also provides a “lazy” list data structure called “streams”. Streams are like lists but can actually be infinite because their elements are only produced “on demand”. The standard library includes a module stream.q which implements a lot of useful stream operations. Most list operations carry over to streams accordingly. For instance, if we want to create a geometric series like the one generated with iter above, but we do not know how many elements will be needed in advance, we can employ the stream generation function iterate:

 
==> iterate (/3) 1
{1|iterate (/3) ((/3) 1)}

The {|} stream constructor works in the same way as the [|] list constructor, but is “special” in that it does not evaluate its arguments, i.e., the head and the tail of the stream. (Otherwise the call to iterate would never terminate.)

To get all members of the series we can apply the scanl function:

 
==> def S = scanl (+) 0 _; S
{0|scanl (+) (0+1) (iterate (/3) ((/3) 1))}

Now we can extract any number of initial values of the series using the take operation, and convert the resulting stream to an ordinary list with the list function. For instance, if we are interested in the first five values of the series, we proceed as follows:

 
==> list (take 5 S)
[0,1,1.33333333333333,1.44444444444444,1.48148148148148]

Note that the stream S is really infinite; if we want, we can extract any value in the series:

 
==> S!9999
1.5

Let's see how many iterations are actually required to reach the limit 1.5 with an error of at most 1e-15:

 
==> #takewhile (>1e-15) (map (1.5-) S)
32

This means that the sum of the first 31 series terms is needed to get an accuracy of 15 digits (which is the best that we can hope for with 64 bit floating point values). We can readily verify this using iter:

 
==> sum (iter 31 (/3) 1)
1.5

The standard library also implements stream enumerations and comprehensions which work like the corresponding list and tuple constructs but are evaluated in a lazy fashion, and the interpreter provides syntactic sugar for these. In particular, the notation {X..} or {X,Y..} can be used to specify an infinite arithmetic sequence. For instance, here is another way to define the infinite geometric series from above:

 
==> def S = scanl (+) 0 {1/3^N : N in {0..}}

==> list (take 5 S)
[0,1.0,1.33333333333333,1.44444444444444,1.48148148148148]

It is worth noting here that streams are just one simple example of a lazy data structure, which happen to play an important role in many useful algorithms so that it makes sense to provide them as a builtin. The Q language makes it rather easy to define your own lazy data structures using so-called special forms. In fact, special forms provide a uniform means to define both lazy data constructors and macro-like functions which take their arguments unevaluated, employing “call by name” parameter passing. This will be discussed in detail in Special Forms.

Special forms are an integrated part of Q which adds a great amount of expressive power to the language. In particular, they allow specialized constructs such as a “short-circuit” conditional expressions to be defined in the standard library, just like “ordinary” functions, rather than having to provide them as builtins. One example is the conditional expression if X then Y else Z which is nothing but syntactic sugar for the standard library function ifelse:

 
==> ifelse (5>0) "positive" "negative"
"positive"

==> if 5>0 then "positive" else "negative"
"positive"

Another special form which is useful in many situations is lambda (as of Q 7.1, this one is actually provided as a builtin), which allows you to create little “anonymous” functions on the fly. The customary notation \X.Y is provided as a shorthand for lambda X Y. These constructs are also called lambda abstractions, or simply lambdas. For instance, the following interpreter command defines the well-known factorial function using a lambda abstraction and assigns the resulting function object to the variable fac:

 
==> var fac = \N.if N>0 then N*fac (N-1) else 1

==> fac
\X1 . if X1>0 then X1*fac (X1-1) else 1

==> map fac [1..10]
[1,2,6,24,120,720,5040,40320,362880,3628800]

Lambdas taking multiple arguments can be created like this:

 
==> \X Y.(1-X)*Y
\X1 . \X2 . (1-X1)*X2

==> _ 0.9 0.5
0.05

The lambda arguments can also be patterns to be matched against the actual parameters. For instance:

 
==> \(X,Y).(1-X)*Y
\(X1,X2) . (1-X1)*X2

==> _ (0.9,0.5)
0.05

Besides string and list processing functions and general utility functions of the kind sketched out above, the standard library also includes a collection of efficient “container” data structures which are useful in many applications. For instance, so-called hashed dictionaries (also known as “hashes” or “associative arrays”) are implemented by the hdict.q script. The following expressions show how to create a dictionary object from a list, and how to access this dictionary using the index (!), keys and vals operations. (Note that here we have to load the hdict.q module explicitly, as it is not automatically included via the prelude. We could also import the stdtypes.q module instead, to get hold of all the container types.)

 
==> import hdict

==> def D = hdict [(foo,99),(bar,-1),(gnu,foo)]; D!gnu
foo

==> keys D
[foo,bar,gnu]

==> vals D
[99,-1,foo]

This completes our little tour through the standard library. To find out more, please refer to The Standard Library, or take a look at the scripts themselves.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.3 Writing a Script

Now that we have taken the first big hurdle and are confident that we can actually get the Q interpreter to evaluate us some expressions, let us try to define a new function for use in the interpreter. That is, we are going to write our first own Q “program” or “script”.

In the Q language, there are actually two different ways to create new functions. As we have already seen in the previous sections, little “anonymous” functions are often computed on the fly by deriving them from existing functions using partial applications or lambda abstractions. The other way, which is often more convenient when the functions to be defined get more complicated, is the definition of named functions by the means of “equations”, which works in a way similar to algebraic definitions in mathematics. We will discuss how to do this in the following.

Having exited the Q interpreter, invoke your favourite text editor and enter the following simple script which implements the factorial function:

 
fac N                   = N*fac(N-1) if N>0;
                        = 1 otherwise;

(Note that in order to define functions equationally, you really have to write a script file. For technical reasons there currently is no way to enter an equational definition directly in the interpreter.)

Save the script in the file fac.q and then restart the interpreter as follows (here and in the following ‘$’ denotes the command shell prompt):

 
$ q fac.q

You can also edit the script from within the interpreter (using vi or the editor named by the EDITOR environment variable) and then restart the interpreter with the new script, as follows:

 
==> edit fac.q

==> run fac.q

In any case, now the interpreter should know about the definition of fac and we can use it like any of the built-in operations:

 
==> map fac [1..10]
[1,2,6,24,120,720,5040,40320,362880,3628800]

For instance, what is the number of 30-element subsets of a 100-element set?

 
==> fac 100 div (fac 30*fac 70)
29372339821610944823963760

As another example, let us write down Newton's algorithm for computing roots of a function. Type in the following script and save it to the file newton.q:

 
newton DX DY F          = until (satis DY F) (improve DX F);
satis DY F X            = abs (F X) < DY;
improve DX F X          = X - F X / derive DX F X;
derive DX F X           = (F (X+DX) - F X) / DX;

Restart the interpreter with the newton.q script and try the following. Note that the target function here is \Y.Y^3-X which becomes zero when Y equals the cube root of X.

 
==> var eps = .0001, cubrt = \X.newton eps eps (\Y.Y^3-X) X

==> cubrt 8
2.00000000344216

Well, this is not that precise, but we can do better:

 
==> def eps = 1e-12

==> cubrt 8
2.0

[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.4 Definitions

As mentioned in the previous section, in the Q language function definitions take the form of equations which specify how a given expression pattern, the left-hand side of the equation, is transformed into a new expression, the right-hand side of the equation. In this section we elaborate on this some more to give you an idea about how these equational definitions work. Our explanations will necessarily be a bit sketchy at this point, but the concepts we briefly touch on in this section will be explained in much more detail later.

The left-hand side of an equation may introduce “variables” (denoted by capitalized identifiers, as in Prolog) which play the role of formal parameters in conventional programming languages. For instance, the following equation defines a function sqr which squares its (numeric) argument:

 
sqr X                   = X*X;

An equation may also include a condition part, as in the definition of the factorial function from the previous section:

 
fac N                   = N*fac(N-1) if N>0;
                        = 1 otherwise;

As indicated above, several equations for the same left-hand side can be factored to the left, omitting repetition of the left-hand side. Furthermore, the otherwise keyword may be used to denote the “default” alternative.

The left-hand side of an equation may actually be an arbitrary compound expression pattern to be matched in the expression to be evaluated. For instance, as we have already seen, the expressions [] and [X|Xs] denote, respectively, the empty list and a list starting with head element X and continuing with a list Xs of remaining elements, just as in Prolog. We can use these patterns to define a function add which adds up the members of a list as follows:

 
add []                  = 0;
add [X|Xs]              = X+add Xs;

Thus no explicit operations for “extracting” the members from a compound data object such as a list are required; the components are simply retrieved by “pattern matching”. This works in variable definitions as well. For instance, in the interpreter you can bind variables to the head element and the tail of a list using a single definition as follows:

 
==> def [X|Xs] = [1,2,3]; X; Xs
1
[2,3]

Pattern matching also works if the arguments take the form of function applications. For instance, the following definition implements an insertion operation on binary trees:

 
insert nil Y            = bin Y nil nil;
insert (bin X T1 T2) Y  = bin Y T1 T2 if X=Y;
                        = bin X (insert T1 Y) T2 if X>Y;
                        = bin X T1 (insert T2 Y) if X<Y;

Note the symbols nil and bin which act as “constructors” for the binary tree data structure in this example. (Q actually has no rigid distinction between “function applications” and “data constructors”. This is unnecessary since function applications can be “values” in their own right, as discussed below.)

The Q interpreter treats equations as “rewrite rules” which are used to reduce expressions to “normal forms”. Evaluation normally uses the standard “leftmost-innermost” rule, also known as “applicative order” or “call by value” evaluation. Using this strategy, expressions are evaluated from left to right, innermost expressions first; thus the arguments of a function application (and also the function object itself) are evaluated before the function is applied to its arguments.

An expression is in normal form if it cannot be evaluated any further by applying matching equations (including some predefined rules which are used to implement the built-in operations, see Equations and Expression Evaluation). A normal form expression simply stands for itself, and constitutes a “value” in the Q language. The built-in objects of the Q language (such as numbers and strings) are always in normal form, but also compound objects like lists (given that their elements are in normal form) and any other symbol or function application which does not match a built-in or user-defined equation.

This means that Q is essentially an “exception-free” language. That is, an “error condition” such as “wrong argument type” does not raise an exception by default, but simply causes the offending expression to be returned “as is”. For instance:

 
==> hd []
hd []

==> sin (X+1)
sin (X+1)

You may be disconcerted by the fact that expressions like hd [] and sin (X+1) actually denote legal values in the Q language. However, this is one of Q's key features as a term rewriting programming language, and it adds a great amount of flexibility to the language. Most importantly, it allows expressions, even expressions involving variables, to be evaluated in a symbolic fashion. For instance, you might add rules for algebraic identities and symbolic differentiation to your script and have the interpreter simplify expressions according to these rules. On the other hand, the Q language also allows you to refine the definition of any built-in or user-defined operation and thus it is a simple matter to add an “error rule” like the following to your script when needed:

 
hd []                   = error "hd: empty list";

(The error function is defined in the standard library script error.q, see Diagnostics and Error Messages.) In order to implement more elaborate error handling, you can also use Q's built-in throw and catch functions to raise and respond to exceptions on certain error conditions. This is discussed in Exception Handling.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

2.5 Runtime Errors

There are a few error conditions which may cause the Q interpreter to abort the evaluation with a runtime error message. A runtime error arises when the interpreter runs out of memory or stack space, when the user aborts an evaluation with the interrupt key (usually Ctl-C), and when the condition part of an equation does not evaluate to a truth value (true or false). (All these error conditions can also be handled by the executing script, see Exception Handling.)

You can use the break on command to tell the interpreter that it should fire up the debugger when one of the latter two conditions arises. After Ctl-C, you can then resume evaluation in the debugger, see Debugging. This is not possible if the interpreter stumbles across an invalid condition part; in this case the debugger is invoked merely to report the equation which caused the error, and to inspect the call chain of the offending rule. Hitting the carriage return key will return you to the interpreter's evaluation loop.

For instance, reload the fac.q script from Writing a Script:

 
fac N                   = N*fac(N-1) if N>0;
                        = 1 otherwise;

Now enable the debugger with the break command and type some “garbage” like fac fac:

 
==> break on; fac fac
! Error in conditional
  0>  fac.q, line 1: fac fac  ==>  fac*fac (fac-1) if fac>0
(type ? for help)
:

What happened? Well, the debugger informs us that the error occurred when the interpreter tried to apply the first equation,

 
fac N                   = N*fac(N-1) if N>0;

to the expression fac fac. In fact this is no big surprise since we cannot expect the interpreter to know how to compare a symbol, fac, with a number, 0, which it tried when evaluating the condition part of the above equation. So let us return to the interpreter's prompt by hitting the carriage return key:

 
: <CR>

==>

The interpreter now waits for us to type the next expression. (For a summary of debugger commands please refer to Debugging. You can also type ? or help while in the debugger to obtain a list of available commands.)


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3. Lexical Matters

The vocabulary of the Q programming language consists of notations for identifiers, operators, integers, floating point numbers, character strings, comments, a few reserved words which may not be used as identifiers, and some special punctuation symbols which are used as delimiters.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.1 Whitespace and Comments

Whitespace (blanks, tabs, newlines, form feeds) serves as a delimiter between adjacent symbols, but is otherwise ignored. Comments are treated like whitespace:

 
/* This is a comment ... */

Note that these comments cannot be nested. C++-style line-oriented comments are also supported:

 
// C++-style comment ...

Furthermore, lines beginning with the #! symbol denote a special type of comment which may be processed by the operating system's command shell and the Q programming tools. On UNIX systems, this (odd) feature allows you to execute Q scripts directly from the shell (by specifying the q program as a command language processor) and to include compiler and interpreter command line options in a script (see Running Scripts from the Shell).


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.2 Identifiers and Reserved Words

Identifiers are denoted by the usual sequences of letters (including ‘_’) and digits, beginning with a letter. Upper- and lowercase is distinct. In the Q language, identifiers are used to denote type, function and variable symbols. Type identifiers may start with either an uppercase or lowercase letter (the convention, however, is to use an initial uppercase letter). The case of the first letter matters, however, for function and variable symbols. As in Prolog, a capitalized identifier (such as X, Xmax and XMAX) indicates a variable symbol, whereas identifiers starting with a lowercase letter denote function symbols (unless they are declared as “free” variables, see below). Unlike in Prolog, the underscore ‘_’ counts as a lowercase letter, hence _MAX is a function symbol, not a variable. (The idea behind this is that it allows you to get a function symbol which appears to start with an uppercase letter by stropping it with an initial ‘_’.) However, as an exception to the general rule, the identifier ‘_does denote a variable symbol, the so-called anonymous variable. The same rules also apply to symbols created interactively in the interpreter.

Variables actually come in two flavours: bound and free variables, i.e., variables which also occur on the left-hand side of an equation, and variables which only occur on the right-hand side and/or in the condition part of an equation. Identifiers may also be declared as free variables; see Declarations. In this case, they may also start with a lowercase letter.

Type, function and free variable identifiers may also be qualified with a module identifier prefix (cf. Scripts and Modules), to specifically denote a symbol of the given module. Such a qualified identifier takes the form module-id::identifier; no whitespace or comments are allowed between the module name, ‘::’ symbol and the function or variable identifier.

Formally, the syntax of identifiers is described by the following grammatical rules:

 
identifier              : unqualified-identifier
                        | qualified-identifier
qualified-identifier    : module-identifier '::'
                          unqualified-identifier
unqualified-identifier  : variable-identifier
                        | function-identifier
                        | type-identifier
module-identifier       : letter {letter|digitsym}
type-identifier         : letter {letter|digitsym}
variable-identifier     : uppercase-letter {letter|digitsym}
                        | '_'
function-identifier     : lowercase-letter {letter|digitsym}
letter                  : uppercase-letter|lowercase-letter
uppercase-letter        : 'A'|...|'Z'|uppercase unicode letter
lowercase-letter        : 'a'|...|'z'|'_'|lowercase unicode letter
digitsym                : '0'|...|'9'|unicode digit

(Please refer to Q Language Grammar, for a description of the BNF grammar notation used throughout this document.)

The following symbols are reserved words of the Q language and may not be used as identifiers:

 
as        const     def       else      extern    from      if
import    include   otherwise private   public    special   then
type      undef     var       virtual   where

In addition, the following identifiers are predeclared as operator symbols (see Operator Symbols) and cannot be used as normal identifiers either:

 
and       div       mod       not       or

[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.3 Operator Symbols

Operator symbols may either take the form of function identifiers or they may be sequences of punctuation characters (excluding ‘_’ which serves as a letter in the Q language). Like function and free variable symbols, they may be qualified with a module identifier prefix:

 
op                      : unary-op|binary-op
unary-op                : opsym
binary-op               : opsym|'and' 'then'|'or' 'else'
opsym                   : unqualified-opsym
                        | qualified-opsym
qualified-opsym         : module-identifier '::'
                          unqualified-opsym
unqualified-opsym       : function-identifier
                        | punctsym {punctsym}
punctsym                : unicode punctuation character

In either case, operator symbols must be declared explicitly before they can be used, cf. Declarations. The declaration determines the precedence and “fixity” of the operator (i.e., whether it acts as a unary prefix or a binary infix operator; see User-Defined Operators, for more information on this).

As already mentioned, the following identifiers are predefined as built-in operators:

 
and       div       mod       not       or

As indicated by the syntax rules, the logical connectives ‘and’ and ‘or’ can also be combined with the keywords ‘then’ and ‘else’ to form the “short-circuit” connectives ‘and then’ and ‘or else’.

The punctuation symbols predefined as built-in operators are the following:

 
` ' ~ & . || < > = <= >= <> == ++ + - * / ^ ! # $

Most of these can actually be redeclared by the programmer for his own purposes. However, there are a few symbols, called “soft delimiters”, which play a special role in the syntax of the Q language (some of these are also used as operator symbols):

 
~ . .. : | = == - \ @

The soft delimiters may occur inside user-defined operators, but if they are used as separate lexemes then they are treated like reserved words and thus they may not be declared as operator symbols (except for the purpose of making aliases, cf. Declarations). The same applies to the reserved keywords (see Identifiers and Reserved Words).

Moreover, the following special symbols serve as “hard delimiters” in the Q language which always separate lexemes and thus may not occur inside operator symbols at all:

 
" , ; :: ( ) [ ] { }

The same applies to whitespace and other non-printable characters, as well as the comment delimiters (‘//’ and ‘/*’ as well as initial ‘#!’ on a script line).

Symbols consisting of punctuation are generally parsed using the “longest possible lexeme” a.k.a. “maximal munch” rule. Here, the “longest possible lexeme” refers to the longest prefix of the input such that the sequence of punctuation characters either forms a valid (i.e., declared) operator symbol, or one of the reserved and special delimiter symbols listed above. Thus, e.g., ‘..#’ will usually be parsed as ‘.. #’, i.e., a reserved ‘..’ symbol followed by a ‘#’ operator. This holds unless the entire sequence ‘..#’ has already been declared as an operator in the scope where it is used.

The only exception to the “maximal munch” rule occurs inside declarations where a new operator symbol is being introduced. In this case the symbol extends up to the next hard delimiter symbol (usually ‘)’ or ‘;’) or whitespace/non-printable character in the input. For instance, the following Q code snippet declares a new binary operator symbol ‘+~%’ (please refer to Declarations, for an explanation of the declaration syntax):

 
public (+~%) X Y;

Another special case arises with conglomerates of three or more consecutive ‘:’ symbols. In this case the initial ‘::’ is always treated as the designation of a qualified (operator) symbol. (In the unlikely case that you ever need to specify a “guarded variable” of the form X : ::Type with a qualified type in the built-in namespace, the space between the ‘:’ token and the following ‘::’ qualification designator is mandatory.)


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.4 Numbers

Signed decimal numeric constants are denoted by sequences of decimal digits (only the standard digits 0..9 in the ASCII character set may be used here) and may contain a decimal point and/or a scaling factor. Integers can also be denoted in octal or hexadecimal, using the same syntax as in C:

 
number                  : ['-'] unsigned-number
unsigned-number         : '0' octdigitseq
                        | '0x' hexdigitseq
                        | '0X' hexdigitseq
                        | digitseq ['.' [digitseq]] [scalefact]
                        | [digitseq] '.' digitseq [scalefact]
digitseq                : digit {digit}
octdigitseq             : octdigit {octdigit}
hexdigitseq             : hexdigit {hexdigit}
scalefact               : 'E' ['-'] digitseq
                        | 'e' ['-'] digitseq
digit                   : '0'|...|'9'
octdigit                : '0'|...|'7'
hexdigit                : '0'|...|'9'|'a'|...|'f'|'A'|...|'F'

Simple digit sequences without decimal point and scaling factor are treated as integers; if the sequence starts with ‘0’ or ‘0x’/‘0X’ then it denotes an integer in octal or hexadecimal base, respectively. Other numbers denote (decimal) floating point values. If a decimal point is present, it must be preceded or followed by at least one digit. Both the scaling factor and the number itself may be prefixed with a minus sign. (Syntactically, the minus sign in front of a number is interpreted as unary minus, cf. Expressions. However, if unary minus occurs in front of a number, it is interpreted as a part of the number and is taken to denote a negative value. See the remarks concerning unary minus in Expressions.) Some examples:

 
0  -187326  0.0  -.05  3.1415e3 -1E-10 0177 0xaf -0XFFFF

[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.5 Strings

String constants are written as character sequences enclosed in double quotes:

 
string                  : '"' {char} '"'
char                    : any character but newline and "

To include newlines, double quotes and other (non-printable) characters in a string, the following escape sequences may be used:

 
\n                      newline
\r                      carriage return
\t                      tab
\b                      backspace
\f                      form feed
\"                      double quote
\\                      backslash

Furthermore, a character may also be denoted in the form \N, where N is the character number in decimal, hexadecimal or octal (using the same syntax as for unsigned integer values). Note that the character number may consist of an arbitrary number of digits; the resulting value will be taken modulo the size of the local character set, which is 256 in the case of ASCII-only systems and 0x110000 if the interpreter has been built with Unicode support (see Unicode Support). Optionally, the character number may also be enclosed in parentheses; this makes it possible to specify a string in which a character escape is immediately followed by another character which happens to be a valid digit, as in "\(123)4".

As of version 7.11 and later, the interpreter also supports symbolic character escapes of the form ‘\&name;’, where name is any of the XML single character entity names specified in the “XML Entity definitions for Characters”, see http://www.w3.org/TR/xml-entity-names/. Note that, at the time of this writing, this is still a W3C working draft, so the supported entity names may be subject to change until the final specification comes out; the currently supported entities are described in the draft from 14 December 2007, see http://www.w3.org/TR/2007/WD-xml-entity-names-20071214/. Also note that multi-character entities are not supported in this implementation.

A string may also be continued across lines by putting the \ character immediately before the end of the line, which causes the following newline character to be ignored.

As of Q 7.0, it is a syntax error if a character escape is not recognized as either a numeric escape or one of the special escapes listed above.

Some examples:

 
""                      empty string
"A"                     single character string
"\27"                   ASCII escape character (ASCII 27)
"\033"                  same in octal
"\0x1b"                 same in hexadecimal
"\(0x1b)c"              ASCII escape followed by a literal ‘c’ character
"Gr\&auml;f"            German umlaut, using XML entity escape
"a string"              multiple character string
"a \"quoted\" string"   include double quotes
"a line\n"              include newline
"a very \               continue across line end
long line\n"

[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

3.6 Unicode Support

Since version 7.0 the Q interpreter has full Unicode support. This means that identifiers may actually contain arbitrary alphanumeric letter and digit symbols from the entire Unicode character set. (All non-uppercase alphabetical letters are considered as lowercase letters which may begin a function identifier.) Likewise, operator symbols may consist of arbitrary punctuation characters in the Unicode character set. (We refrain from giving any concrete examples here, to keep this document 7-bit clean.)

Moreover, all string values are internally represented using the UTF-8 encoding, which allows you to represent all characters in the Unicode character set, while retaining backward compatibility with 7 bit ASCII. String literals in the source script will be translated from the system encoding to UTF-8 automatically and transparently, and the \N notation may specify any valid Unicode point. For this to work, the interpreter must have been built with Unicode support enabled (which is the default).


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4. Scripts and Modules

The basic compilation unit in the Q language is the script, which is simply a (possibly empty) sequence of declarations and definitions in a single source file:

 
script                  : {declaration|definition}

In order to make a given set of definitions available for use in the interpreter, you can just collect the relevant declarations, variable definitions and equations in a script which you submit to the interpreter for execution. The interpreter in turn invokes the compiler to translate the script into a bytecode file which is loaded by the interpreter (see Using Q). A script may also be empty, in which case it does not provide any definitions at all.

If you put all your definitions into a single script, that's all there is to it. However, you will often want to break down a larger script into a collection of smaller units which can be managed separately and which are linked together by the compiler into a single bytecode file. To these ends the Q language allows you to put your definitions into several separate script files, which are also called modules. To gain access to the function, variable and type symbols provided by another module, you must then use an import or include declaration, using the following syntax:

 
declaration             : unqualified-import
                        | qualified-import

unqualified-import      : 'import' module-spec {',' module-spec} ';'
                        | 'include' module-spec {',' module-spec} ';'

qualified-import        : 'from' module-spec 'import' [symbol-specs] ';'
                        | 'from' module-spec 'include' [symbol-specs] ';'

module-spec             : module-name ['as' unqualified-identifier]
module-name             : unqualified-identifier
                        | string

symbol-specs            : symbol-spec {',' symbol-spec}

symbol-spec             : unqualified-identifier ['as' unqualified-identifier]
                        | unqualified-opsym ['as' unqualified-opsym]

As of version 7.8, Q supports both unqualified and qualified import clauses. The former allows you to quickly import an entire collection of modules, while the latter gives you precise control over which symbols are to be imported from which modules. We describe each of these in turn.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4.1 Unqualified Imports

Easiest first, let as take a look at unqualified imports. The following declaration imports two modules foo and bar:

 
import foo, bar;

If you put such declarations into your “main script” and then run the script with the interpreter, the imported modules will be loaded as well.

To determine which functions, variables and types are actually available to be imported into another script, the Q language allows you to declare symbols either as “public” or “private”, see Declarations. Outside a module, only the public symbols are accessible when the module is imported in another module.

Normally import is not “transitive”, i.e., importing a module foo does not automatically give you access to the symbols imported by foo, but only to the public symbols declared in foo itself. To work around this, instead of import you can also use an include declaration which causes all imports and includes of the included module to be “reexported”. For instance, let us consider the case that module foo includes module bar:

 
include bar;

Then by importing module foo you also gain access to the public symbols of module bar (and, recursively, to the modules included by bar, etc.). This provides a means to group different modules together in one larger “umbrella” module. For instance, the standard prelude script (cf. The Standard Library) simply includes most of the other standard library modules.

In the Q language, each module has its own separate namespace. This means that two different modules, say foo1 and foo2, may both provide their own public function symbol named foo. If both foo1 and foo2 are imported in the same script, you can distinguish between the two by using a qualified identifier, using the module name as a prefix, e.g., foo1::foo or foo2::foo. (Writing simply foo in this case will produce an error message because the reference is ambiguous.)

Import and include declarations can occur anywhere in a script (and will be active from this point on up to the end of the script file), but it is common (and recommended) practice to put them near the beginning of the script. As in most other programming languages, module dependencies must always be acyclic. If the compiler finds a cyclic chain of import and include declarations, such as a module importing itself or a module foo importing a module bar which in turn imports foo again, it produces an error message.

Another caveat: When using unqualified import clauses, it is much too easy to accidentally “reuse” an imported symbol because of a missing local symbol declaration. For instance, if you import a module foo which happens to export a symbol bar, and then you later define a function bar in your script, your definition will use the imported bar symbol. This is perfectly legal in Q, and may be intended, but if it's not then you have to explicitly declare a new local bar symbol after the import clause, cf. Declarations. If you forget this, you will errorneously redefine the existing bar function instead. One way to avoid this is to always use qualified imports, as described below. Another possibility is to run the interpreter with the --pedantic option, in which case it will warn you about symbols from unqualified imports which are referred to with unqualified identifiers, see Running Compiler and Interpreter, for details.

As indicated in the syntax rules, the names of the modules to be imported can either be specified using an unqualified identifier, or a string denoting a full or relative pathname, e.g.:

 
import "/home/ag/q/stuff.q";

In this case, you can also omit the ‘.q’ suffix, it will be supplied automatically when necessary. Moreover, the module identifier is automatically the basename of the script filename (i.e., stuff in the above example); therefore, it is a good idea to choose basenames which are legal Q identifiers so that you can use them in qualified symbols.

If no absolute path is specified, the interpreter locates script files using its “search path”, which usually contains the current directory and other system-dependent or user-defined locations where “library” scripts are kept, see Using Q, for details.

The ‘as’ keyword can be used to explicitly specify an unambiguous module name. This is useful, in particular, if the basename of the module is not a valid identifier, and for the purpose of resolving name clashes. For instance:

 
import "my/stdlib.q" as mystdlib;

Simply importing "my/stdlib.q" under its own name in this case would produce an error message, because the name of the module collides with the standard library module of the same name, and the compiler enforces that all module names are unique.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4.2 Qualified Imports

A qualified import takes the form:

 
from foo import foo, BAR;

This specifically imports the symbols foo and BAR from module foo, and nothing else. What this actually does is to “redeclare” the imported symbols in the namespace of the importing module, as explained in Cross-Checking of Declarations and Aliases, so that they look like private symbols of the importing module. Thus you can refer to them, e.g., simply as foo or with the qualified name gnats::foo, where gnats is the name of the importing module. Note that foo::foo will not work in this case, because a qualified import does not bring the module itself into scope. (However, as pointed out below you can use both a qualified and an unqualified import in concert to make both gnats::foo and foo::foo work.)

A note on terminology: In Q parlance, the terms qualified import and unqualified import apply to the import clauses themselves; a qualified import clause is called “qualified” simply because it restricts the set of imported symbols. Unqualified and qualified identifiers can be used with both styles of import clauses alike.

As with unqualified import, there is a second kind of qualified import clause which also reexports the imported symbols, by making them public members of the importing module:

 
from foo include foo, BAR;

For convenience, you can also write:

 
from module import;

or

 
from module include;

without listing any symbols to just import or include all symbols of the given module qualified. (It goes without saying that this should be used with care.) This is roughly equivalent to ‘import module’ or ‘include module’, respectively, but the imported symbols are made available as members of the namespace of the client, not the imported module.

If this sounds a bit confusing, here is an example which should clarify the differences between unqualified and qualified imports. Let's assume the following three modules A, B and C:

 
/* A: */
import B;

/* B: */
include C;

/* C: */
public foo;

Using this setup, the symbol foo is available as either just ‘foo’ or as ‘C::foo’ in all of A, B and C. Now suppose we change module B to use a qualified ‘include’ instead:

 
/* B: */
from C include foo; /* or just: from C include; */

C's namespace hasn't changed, of course, but in both A and B the symbol foo is now available as ‘foo’ or ‘B::foo’, but not as ‘C::foo’ anymore. However, you can also combine unqualified and qualified imports, like this:

 
/* B: */
include C;
from C include foo;

Now the symbol foo is available as either ‘foo’, ‘B::fooorC::foo’ in B and C (and the compiler will recognize it as the same symbol no matter which notation you use).

Finally, note that since symbols imported using a qualified clause become members of the importing namespace, they must not collide with other symbols declared either locally or in another qualified import clause. Thus, if you have two modules foo1 and foo2 which both export their own (local) symbol foo, the following will provoke an error message from the compiler because of the clash between the two different foo symbols:

 
from foo1 import foo;
from foo2 import foo;

In such cases you must either use unqualified import, or employ an ‘as’ clause to rename the symbols while importing them. E.g.:

 
from foo1 import foo as foo1;
from foo2 import foo as foo2;

Note that this is only a problem if the imported symbols are really distinct. If the symbols are just different “aliases” of the same symbol defined elsewhere, then you can just import them under the same name into the same scope without any problems.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4.3 Implicit Imports and the Prelude

Actually, there is yet another kind of import, namely the implicit import of the prelude.q script at the beginning of each script, which in turn includes most standard library modules (see The Standard Library) and makes them readily available in your program, without having to use an explicit import declaration. When looking up an unqualified symbol in a given script, the compiler first searches for a symbol defined in that script, then for a symbol in an imported or included module, then for a symbol defined by the prelude and its includes, and finally for a built-in symbol.

You can instruct the compiler to inhibit the default import of the prelude, by using the --no-prelude option, see Using Q. Moreover, you can also override the standard prelude with your own, if it occurs on the search path before the standard prelude, see Setting up your Environment.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

4.4 The Global Namespace

The namespace available in the interpreter is always that of the main script, i.e., the script given on the command line. This namespace also includes the built-in namespace which contains all the built-in symbols, as well as the implicit imports, i.e., the prelude and all its includes (unless the --no-prelude option was used). The built-in namespace can be accessed explicitly by using an empty module qualifier, as in ::sin. Qualified prelude symbols use the corresponding module name as the qualifier (e.g., stdlib::cat), just as with explicit imports.

The interpreter also allows you to dynamically import additional modules in the global scope using the import command, see Command Language. Moreover, to facilitate testing and debugging, in the interpreter it is possible to gain access to all public and private symbols of the program (also in modules not directly imported in the main script) using qualified identifiers.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5. Declarations

The Q language allows you to declare function and (free) variable symbols explicitly, by means of the syntactic constructs discussed in this chapter. Symbol declarations are mostly optional; if you introduce a new symbol without declaring it, the compiler declares it for you. However, you will sometimes wish to ensure that a new symbol is created to override a symbol of the same name from an imported module, or you want to attach special attributes to a symbol, and then you have to use an explicit declaration. Explicit declarations are also needed to introduce new operator and type symbols. Moreover, while implicit declarations are often convenient when you want to get something done quickly, for larger projects you might prefer to play it safe and ensure that each function symbol actually has a proper explicit declaration. In such cases you can run the compiler with the --pedantic option (see Running Compiler and Interpreter) to warn you about undeclared occurrences of an identifier.


[ < ] [ > ]   [ << ] [ Up ] [ >> ]         [Top] [Contents] [Index] [ ? ]

5.1 Declaration Syntax

Syntactically, symbol declarations take the following form:

 
declaration             : prefix headers ';'
                        | [scope] 'type' unqualified-identifier
                          [':' identifier] ['=' sections] ';'
                        | [scope] 'extern' 'type' unqualified-identifier
                          [':' identifier] ['=' sections] ';'
                        | [scope] 'type' qualified-identifier
                          ['as' unqualified-identifier] ';'
                        | [scope] 'type' unqualified-identifier
                          '==' identifier ';'

prefix                  : scope
                        | [scope] modifier {modifier}

scope                   : 'private'|'public'

modifier                : 'const'|'special'|'extern'|'var'|'virtual'

headers                 : header {',' header}

header                  : unqualified-identifier '=' expression
                        | unqualified-identifier
                          {['~'] variable-identifier}
                        | qualified-identifier
                          {['~'] variable-identifier}
                          ['as' unqualified-identifier]
                        | '(' unqualified-opsym ')'
                          {['~'] variable-identifier}
                          ['@' precedence]
                        | '(' qualified-opsym ')'
                          {['~'] variable-identifier}
                          ['@' precedence]
                          ['as' unqualified-opsym]

precedence              : unsigned-number
                        | '(' operator ')'

sections                : section {'|' section}

section                 : [prefix] headers

For instance, the following are all valid symbol declarations:

 
public foo;
private extern bar X;
public special lambda X Y;
special cond::ifelse ~P X Y as myifelse;

public (--) Xs Ys;
private (::++) Xs Ys as concat;

const red, green, blue;
var FOO, BAR;
var const e = exp 1;

type Day = const sun, mon, tue, wed, thu, fri, sat;
public type BinTree = virtual bintree Xs | private const nil, bin X T1 T2;

The keywords private and public specify the scope of a symbol. Public symbols are accessible outside a script, and can be imported by other modules (see Scripts and Modules). If the keywords private and public are omitted, the scope defaults to private.

The special keyword serves to declare special forms, which are described in Special Forms. Function symbols declared with const introduce “constant” or “constructor” symbols; the compiler enforces that expressions created with such symbols are not redefined in an equation (cf. Equations). The built-in constants true, false, [] and () are already predeclared as const. The virtual keyword serves to declare “virtual constructors” which can be used to define concrete representations (so-called “views”) of abstract data types, see Views. Non-const function symbols can also be declared as extern, meaning that the corresponding function is actually implemented by a corresponding “external” module, see C Language Interface.

The var keyword allows you to declare “free” variable symbols, which can be assigned a value by means of a variable definition, see Free Variables. If you use such a declaration, the variable symbol may also start with a lowercase letter (if it has not already been declared as a function symbol); note that without such a declaration, an identifier starting with a lowercase letter will implicitly be declared as a function symbol. (The idea behind this is that you can make a variable look like a function symbol, if the variable is actually supposed to be used for function values.)

Variable symbols can also be declared as const; in this case the variable can only be assigned to once and always refers to the same value once it has been defined. The built-in variables INPUT, OUTPUT, ERROR and ARGS are predeclared as const, see Command Language. A variable declaration may also contain an “initializer” of the form = X, where X is an expression. This has the effect of declaring and defining the variable in a single step.

As indicated, the scope-modifier prefix of a declaration is followed by a comma-separated list of headers. The first identifier in each header states the symbol to be declared. In function symbol declarations the function identifier may be followed by a list of variable identifiers which are used to denote the arguments of the declared function symbol. The variables are effectively treated as comments; only their number (called the arity of a function symbol) is recorded to check the consistency of different declarations of the same symbol, and to specify the number of “special” arguments in a special declaration. In special declarations variables may also be prefixed with ‘~’ to declare them as “non-special” arguments; see Special Forms.

The first declaration of an unqualified identifier or operator symbol in a module always introduces a new symbol in the module's namespace. By default (if no explicit declaration precedes the first use of a new, unqualified symbol), the compiler automatically declares it as a private nullary function or variable symbol, depending on the case of the initial letter of the identifier (as already mentioned in Lexical Matters, capitalized identifiers are interpreted as variable symbols, others as function symbols).


[ < ] [ > ]   [ << ] [ Up ]