This is ../../doc/qdoc.info, produced by makeinfo version 4.8 from ../../doc/qdoc.texi. INFO-DIR-SECTION Programming Languages START-INFO-DIR-ENTRY * Q: (qdoc). The Q programming language and system. END-INFO-DIR-ENTRY The Q Programming Language, Version 7.11, 23 February 2008. Copyright (c) 1992-2008 by Albert Graef. This manual has been published in the series `Musikinformatik und Medientechnik', Bereich Musikinformatik, Musikwissenschaftliches Institut, Johannes Gutenberg-Universitaet Mainz 55099 Mainz Germany ISSN 0941-0309 The Q programming system is distributed under the terms of the GNU General Public License. See the software license accompanying the distribution for details. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the author.  File: qdoc.info, Node: Top, Next: Introduction, Prev: (DIR), Up: (DIR) This document describes the Q programming language and system, version 7.11, 23 February 2008. Written by Albert Graef, University of Mainz. * Menu: * Introduction:: * Getting Started:: * Lexical Matters:: * Scripts and Modules:: * Declarations:: * Expressions:: * Equations and Expression Evaluation:: * Types:: * Special Forms:: * Built-In Functions:: * The Standard Library:: * Clib:: Appendix: * Q Language Grammar:: * Using Q:: * C Language Interface:: * Debugging:: * Running Scripts in Emacs:: * References:: * Index::  File: qdoc.info, Node: Introduction, Next: Getting Started, Prev: Top, Up: Top 1 Introduction ************** Q stands for "equational", so Q, in a nutshell, is a programming language which lets you "program by equations". You specify a system of equations which the interpreter uses as "rewrite rules" to reduce expressions to "normal form". This allows you to formulate your programs in a high-level, concise and declarative style. Q's development started out in the 1990s, motivated by the author's research on term pattern matching [Graef 1991] and inspired by the pioneering work on equational programming by Michael O'Donnell and others [O'Donnell 1985], as an attempt to show that term rewriting would provide a feasible basis for a practical programming language. I think that this goal has been achieved, and that the present interpreter is efficient and robust enough for practical purposes. Q has been ported to a variety of operating systems, such as BeOS, FreeBSD, Linux, MacOS X, Solaris and Windows. Porting to other modern UNIX/POSIX environments should be a piece of cake. Thus Q can be used on most modern computer systems, and Q scripts written on one system will usually run on all supported platforms. The Q language supports a rich variety of built-in types, like arbitrary precision integers, floating point numbers (double precision 64 bit), truth values, strings, tuples, lists, streams (a "lazy" list data structure with call-by-need evaluation), lambda functions and files. It also provides primitives for exception handling and multithreaded execution. Q scripts can be broken down into "modules", each with their separate namespace, and Q gives you full control over which symbols (function, variable and type names) are exported by each module. This makes it easy to organize large scripts in a modular fashion. Q also allows you to interface to "external" modules written in the C programming language, which provides a means to access functions in C and C++ libraries and employ C's higher processing speed for time-critical tasks. Conversely, Q scripts can also be executed from C and C++, which allows Q to be used as an embedded language or term rewriting engine in C/C++ applications. As a practical programming language, Q comes with "batteries included". The standard library, which is mostly written in Q itself, provides rational and complex numbers, a lot of useful tuple, list and stream processing functions (including comprehensions), some common container data structures (dictionaries, sets, etc.), and an interface to the PostScript language. It also includes an extensive system interface which offers services such as binary and C-style formatted I/O, BSD socket I/O, process management, POSIX threads, regular expression matching and internationalization features. Additional extension modules provide interfaces to a number of other third party libraries, which turns Q into a practical tool for a variety of application areas. In difference to other functional languages, Q is entirely based on the notions of "rewrite rules", "reductions" and "irreducible expressions" (also known as "normal forms") pertaining to the "term rewriting" calculus. A Q "program", called a "script", consists of equations which are treated as rewrite rules and are used to reduce expressions to normal form. The normal form of an expression denotes its value, which can itself be a compound expression. Q has no rigid distinction between "constructor" and "defined" function symbols and it also allows you to evaluate expressions containing "free" variables. Basically, both sides of an equation may involve arbitrary expressions. Therefore Q can also be used as a tool for symbolic expression evaluation. On the surface, Q looks very much like contemporary functional languages such as Miranda or Haskell. In fact, the syntax of the language has largely been inspired by the first edition of `Introduction to Functional Programming' by Richard Bird and Philip Wadler. However, Q is an interpreted language with dynamic typing and eager (leftmost-innermost) evaluation, which is more in line with classical functional languages such as Lisp. For the sake of efficiency, Q scripts are first translated into "bytecode" (an intermediate binary format) which is executed on a virtual stack machine. The interpreter automatically optimizes tail recursion, such that "iterative" algorithms can be realized in constant stack space. Besides the built-in I/O operations, the language is free of side-effects; in particular, Q itself does not have mutable variables (although the standard library provides such imperative features for those who need them). Q also provides two novel and (IMHO) interesting features: a notion of "special forms" which allows to handle both macro-like functions and lazy evaluation in a uniform setting without having to give up the basic eager evaluation strategy; and a notion of "type guards" which provides a means to cope with hierarchies of abstract data types (similar to the notion of classes with single inheritance in object-oriented languages) in the context of a term rewriting language. Using Q is supposed to be fairly simple: you throw together some equations, start the interpreter and then type in the expressions you wish to evaluate. All this can be done with a few keystrokes, if you use the Emacs Q mode supplied with the package. A graphical user interface for Windows is also available. Of course, you can also run the Q programming utilities from the command line if you prefer that. The manual is organized as follows. In *Note Getting Started::, we start out with a brief and informal introduction to the main features of the language, and a glimpse of how Q scripts look like. *Note Lexical Matters::, describes the lexical part of the Q language. In *Note Scripts and Modules::, we discuss how declarations, definitions and equations are put together to form a script, and how to break down larger scripts into a collection of smaller modules which can be managed separately. *Note Declarations::, discusses how certain entities like types, variables and function symbols can be declared in a Q script. *Note Expressions::, treats the syntax of Q expressions, and describes the built-in operators provided by the Q language. *Note Equations and Expression Evaluation::, is about equations and variable definitions, and how they are used in the evaluation process. *Note Types::, and *Note Special Forms::, describe the facilities provided by the Q language for dealing with abstract data types and deferred evaluation. *Note Built-In Functions::, and *Note The Standard Library::, discuss the built-in and standard library functions of the Q language. *Note Clib::, describes Q's "system module" which provides access to some important functions from the C library. The appendix gives additional information about some aspects of the language and its implementation. *Note Q Language Grammar::, contains a summary of the Q language syntax in BNF. *Note Using Q::, provides a description of the programming tools included in the Q programming system, *Note C Language Interface::, describes Q's interface to the C programming language, and *Note Debugging::, is a brief introduction to the symbolic debugger built into the Q interpreter. Finally, *Note Running Scripts in Emacs::, discusses how Q scripts can be edited and run using the Emacs editor. Acknowledgements This project has become much further reaching and took a lot longer than I anticipated when I started out working on it, and I wouldn't have been able to keep it going without the patience and support of my beloved wife Evelyn and children Sebastian, Janosch, Yannic and Miriam. Next I have to acknowledge the support and help provided by colleagues at Mainz and Bremen (Germany) and at INRIA Lorraine (France) during the initial phases of this project. In 1992 Dieter Hofbauer invited me to the term rewriting group at INRIA Lorraine to discuss the design of the (then still nascent) Q with his colleagues and friends. During the following years, I also collaborated with Frank Drewes at Bremen (now at Umea, Sweden) who was involved in the initiation of this project, and Klaus Barthelmann at Mainz who provided many comments which helped to improve the early versions of this manual. Frank and Klaus tested various early beta versions of the Q interpreter, and contributed sample scripts as well as substantial parts of the standard Q library. I'd also like to express my gratitude to the growing community of users of the Q language all over the world, in particular the members of the Q mailing list, for the lively discussions which helped to improve the language and its implementation, and for the contributed Q modules and applications. Special thanks are due to John Cowan for proofreading the manual, for his valuable suggestions concerning design and implementation issues and for his Q-Chicken interface, and to Rob Hubbard for his comprehensive rational number and polynomial modules which are now part of the distribution. Last but not least, thanks are also due to the developers of ML and Haskell. While these people have not been involved with Q in any way, their work, which has changed the landscape of functional programming, has been a constant source of inspiration for me.  File: qdoc.info, Node: Getting Started, Next: Lexical Matters, Prev: Introduction, Up: Top 2 Getting Started ***************** A new programming language is best conveyed through some examples, and therefore it is common practice to start a language description with an introductory chapter which treats the most essential language features in a fairly informal manner. This is also the purpose of the present chapter. We first show how some of the "standard features" of the Q programming language can be employed for using the Q interpreter effectively as a sophisticated kind of "desktop calculator". The remainder of this section then adresses the question of how you can extend the Q environment by providing your own definitions in Q scripts, and describes the evaluation process and the treatment of runtime errors in the interpreter. * Menu: * Using the Interpreter:: * Using the Standard Library:: * Writing a Script:: * Definitions:: * Runtime Errors::  File: qdoc.info, Node: Using the Interpreter, Next: Using the Standard Library, Prev: Getting Started, Up: Getting Started 2.1 Using the Interpreter ========================= To begin with, let us show how the Q interpreter is invoked to evaluate some expressions with the built-in functions provided by the Q language. Having installed the Q programming system, you can simply invoke the interpreter from your shell by typing `q'. The interpreter will then start up, display its sign-on, and leaves you at its prompt: ==> This indicates that the interpreter is waiting for you to type an expression. Let's start with a simple one: ==> 23 23 Sure enough, that's correct. Whenever you type an expression, the interpreter just prints its value. Any integer is a legal value in the Q language, and the common arithmetic operations work on these values as expected. Q integers are "bignums", i.e., their size is only limited by available memory: ==> 16753418726345 * 991726534256718265234 16614809890429729930396098173389730 "Real" numbers (more exactly, their approximation using 64 bit double precision floating point numbers) are provided as well. Let's try these: ==> sqrt (16.3805*5)/.05 181.0 The `sqrt' identifier denotes a built-in function which computes the square root of its argument. (A summary of the built-in functions can be found in *Note Built-In Functions::.) What happens if you mistype an expression? ==> sqrt (16.3805*5)/,05 ! Syntax error >>> sqrt (16.3805*5)/,05 ^ As you can see, the interpreter not only lets you know that you typed something wrong, but also indicates the position of the error. It is quite easy to go back now and correct the error, using the interpreter's "command history". With the up and down arrow keys you can cycle through the expressions you have typed before, edit them, and resubmit a line by hitting the carriage return key. Also note that what you typed is stored in a "history file" when you exit the interpreter, which is reloaded next time the interpreter is invoked. A number of other useful keyboard commands are provided, see *Note Readline: (readline)Command Line Editing. In particular, you can have the command line editor "complete" function symbols with the `' key. E.g., if you type in `sq' and press the `' key, you will get the `sqrt' function. Before we proceed, a few remarks about the syntax of function applications are in order. The first thing you should note is that in Q, like in most other modern functional languages, function application is simply denoted by juxtaposition: ==> sqrt 2 1.4142135623731 Multiple arguments are specified in the same fashion: ==> max 5 7 7 Parentheses can be used for grouping expressions as usual. In particular, if a function application appears as an argument in another function application then it must be parenthesized as follows: ==> sqrt (sqrt 2) 1.18920711500272 Another important point is that operators are in fact just functions in disguise. You can turn any operator into a prefix function by enclosing it in parentheses. Thus `(+)' denotes the function which adds its two arguments, and `X+1' can also be written as `(+) X 1'; in fact, the former expression is nothing but "syntactic sugar" for the latter, see *Note Built-In Operators::. You can easily verify this in the interpreter: ==> (+) X 1 X+1 You can also have partial function applications like `(*) 2' which denotes a function which doubles its argument. Moreover, Q supports so-called "operator sections" which allow you to specify a binary operator with only either its left or right operand. For instance, `(1/)' denotes the reciprocal and `(+1)' the "increment by 1" function: ==> (+1) 5 6 ==> (1/) 3 0.333333333333333 The interpreter maintains a global variable environment, in which you can store arbitrary expression values. This provides a convenient means to define abbreviations for frequently-used expressions and for storing intermediate results. Variable definitions are done using the `def' command. For instance: ==> def X = 16.3805*5 ==> X 81.9025 As indicated, variable symbols usually start with a capital letter; an initial lowercase letter indicates a function symbol. This convention is valid throughout the Q language. However, it is possible to explicitly declare an uncapitalized symbol as a variable, using the `var' command, and then assign values to it, like so: ==> var f ==> def f = sqrt You can also both declare a variable and initialize its value with a single `var' command: ==> var f = sqrt ==> f X/.05 181.0 Multiple variable declarations and definitions can be entered on a single line, using commas to separate different definitions in a `def' or `var' command, and you can separate multiple expressions and commands on the same line with semicolons: ==> def X = 16.3805*5, f = sqrt; X; f X/.05 81.9025 181.0 You can also bind several variables at once by using an expression "pattern" as the left-hand side of a variable definition: ==> def (f, X) = (sqrt, 16.3805*5); f X/.05 181.0 (Such pattern-matching definitions only work with `def'; the left-hand side of a `var' declaration must always be a simple identifier.) Another useful feature is the built-in "anonymous" variable `_', which is always set to the value of the most recent expression value printed by the interpreter: ==> _ 181.0 ==> 2*_ 362.0 Sometimes you would also like the interpreter to "forget" about a definition. This can be done by means of an `undef' statement: ==> undef X; X X Besides `def', `undef' and `var', the interpreter provides a number of other special commands. The most important command for beginners certainly is the `help' command, which displays the online manual using the GNU info reader. You can also run this command with a keyword to be searched in the info file. For instance, to find out about all special commands provided by the interpreter, type the following: ==> help commands (Type `q' when you are done reading the info file.) Other useful commands are `who' which prints a list of the user-defined variables, and `whos' which describes the attributes of a symbol: ==> who f ==> whos f sqrt f user-defined variable symbol = sqrt sqrt builtin function symbol sqrt X1 You can save user-defined variables and reload them using the `save' and `load' commands: ==> save saving .q_vars ==> def f = 0; f 0 ==> load loading .q_vars ==> f sqrt Some statistics about the most recent expression evaluation can be printed with the `stats' command: ==> sum [1..123456] 7620753696 ==> stats 0.52 secs, 246916 reductions, 246918 cells The `stats' command displays the (cpu) time needed to complete an evaluation, the number of "reductions" (a.k.a. basic evaluation steps) performed by the interpreter, and the number of expression "cells" needed during the course of the computation. This information is often useful for profiling purposes. Other commands allow you to edit and run a script directly from the interpreter, and to inspect and set various internal parameters of the interpreter; see *Note Command Language::. Of course, the Q interpreter can also carry out computations on non-numeric data. In fact, Q provides a fairly rich collection of built-in data types, and makes it easy to define your own. For instance, "strings" are denoted by character sequences enclosed in double quotes. Strings can be concatenated with the `++' operator and the length of a string can be determined with `#'. Moreover, the character at a given position can be retrieved with the (zero-based) indexing operator `!': ==> "abc"++"xyz"; #"abc"; "abc"!1 "abcxyz" 3 "b" The same operations also apply to "lists", which are written as sequences of values enclosed in brackets: ==> [a,b,c]++[x,y,z]; #[a,b,c]; [a,b,c]!1 [a,b,c,x,y,z] 3 b As in Prolog, lists are actually represented as right-recursive constructs of the form `[X|Xs]' where `X' denotes the head element of the list and `Xs' its tail, i.e., the list of the remaining elements. This representation allows a list to be traversed and manipulated in an efficient manner. ==> def [X|Xs] = [a,b,c]; X; Xs a [b,c] ==> [0|Xs] [0,b,c] Another useful list operation is "enumeration", which works with all so-called "enumeration types" (cf. *Note Enumeration Types::), as well as characters and numbers. Enumerations are constructed with the `enum' function, or using the familiar `[X..Y]' notation which is in fact only "syntactic sugar" for `enum': ==> enum "a" "k" ["a","b","c","d","e","f","g","h","i","j","k"] ==> ["a".."k"] ["a","b","c","d","e","f","g","h","i","j","k"] ==> [0..9] [0,1,2,3,4,5,6,7,8,9] ==> [0.1,0.2..1.0] [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0] "Tuples" are sequences of expressions enclosed in parentheses, which are Q's equivalent of "vectors" or "arrays" in other languages. Like lists, tuples can be concatenated, indexed and enumerated, but they use an internal representation optimized for efficient storage and constant-time indexing: ==> (a,b,c)++(x,y,z); #(a,b,c); (a,b,c)!1 (a,b,c,x,y,z) 3 b ==> (0..9) (0,1,2,3,4,5,6,7,8,9) Q also offers built-in operations to convert between lists and tuples: ==> tuple [a,b,c] (a,b,c) ==> list _ [a,b,c] Q provides yet another list-like data structure, namely "lazy" lists a.k.a. "streams", which are written using curly braces instead of brackets. In difference to lists, streams only produce their elements "on demand", when they are needed in the course of a computation. This makes it possible to represent large and even infinite sequences of values in an efficient manner. You can see this somewhat unusual behaviour if you try streams interactively in the interpreter. For instance, let's try to concatenate two streams: ==> {a,b,c}++{x,y,z} {a|{b,c}++{x,y,z}} You probably noticed that the tail of the resulting stream, `{b,c}++{x,y,z}', has not been evaluated yet. But that is all right, because it _will_ be evaluated as soon as we need it: ==> _!4 y This lazy way of doing things becomes especially important when we work with infinite streams. We will have more to say about this in the following section. It is quite instructive to keep on playing a bit like this, to get a feeling of what the Q interpreter can do. You might also wish to consult *Note Built-In Functions::, which discusses the built-in functions, and *Note Using Q::, which shows how the interpreter is invoked and what additional capabilities it offers. To exit the interpreter when you are finished, invoke the built-in `quit' function: ==> quit This function does not return a value, but immediately returns you to the operating system's command shell.  File: qdoc.info, Node: Using the Standard Library, Next: Writing a Script, Prev: Using the Interpreter, Up: Getting Started 2.2 Using the Standard Library ============================== The standard library is a collection of useful functions and data structures which are _not_ provided as built-ins, but are implemented by scripts which are mostly written in Q itself. Most of these scripts are included in the "prelude script" `prelude.q', which is always loaded by the interpreter, so that the standard library functions are always available. See *Note The Standard Library::, for an overview of these operations. You can check which standard library scripts a.k.a. "modules" are actually loaded with the interpreter's `modules' command. With the standard prelude loaded, this command shows the following: ==> modules assert clib* complex cond error math prelude rational sort stdlib stream string tuple typec As you can see, there are quite a few modules already loaded, and you can use the functions provided by these modules just like any of the built-in operations. The standard library provides a lot of additional functions operating on numbers, strings and lists. For instance, you can take the sum of a list of (integer or floating point) numbers simply as follows: ==> sum [1..5] 15 In fact, the library provides a rather general operation, `foldl', which iterates a binary function over a list, starting from a given initial value. Using the `foldl' function, the above example can also be written as follows: ==> foldl (+) 0 [1..5] 15 (There also is a `foldr' function which works analogously, but combines list members from right to left rather than from left to right.) Generalizing the notion of simple enumerations of numbers like `[1..5]', the standard library also provides the general-purpose list-generating function `while'. For instance, we can generate the list of all powers of 2 in the range `[1..1000]' as follows: ==> while (<=1000) (2*) 1 [1,2,4,8,16,32,64,128,256,512] (Recall from the previous section that `(2*)' is the doubling function. And `(<=1000)' is a predicate which checks its argument to be less than or equal to `1000'.) If we want to generate a list of a given size, we can use `iter' instead. So, for instance, we might compute the value of a finite geometric series as follows: ==> sum (iter 4 (/3) 1) 1.48148148148148 The `map' function allows you to apply a function to every member of a list. For instance, the following expression doubles each value in the list `[1..5]': ==> map (2*) [1..5] [2,4,6,8,10] Lists can also be filtered with a given predicate: ==> filter (>=3) [1..5] [3,4,5] The `scanl' operation allows you to compute all the sums of initial segments of a list (or accumulate any other binary operation over a list): ==> scanl (+) 0 [1..5] [0,1,3,6,10,15] (Like `foldl', `scanl' also has a sibling called `scanr' which collects results from right to left rather than left to right, starting at the end of the list.) You can partition a list into an initial part with a given length and the remaining part of the list with `take' and `drop': ==> take 3 [1..5]; drop 3 [1..5] [1,2,3] [4,5] The `takewhile' and `dropwhile' functions have a similar effect, but partition their input list according to a given predicate: ==> takewhile (<=3) [1..5]; dropwhile (<=3) [1..5] [1,2,3] [4,5] Another useful list operation is `zip' which collects pairs of corresponding elements in two input lists: ==> zip [1..5] ["a".."e"] [(1,"a"),(2,"b"),(3,"c"),(4,"d"),(5,"e")] The effect of `zip' can be undone with `unzip' which returns a pair of lists: ==> unzip _ ([1,2,3,4,5],["a","b","c","d","e"]) The `zipwith' function is a generalized version of `zip' which combines corresponding members from two lists using a given binary function: ==> zipwith (*) [1..5] [1..5] [1,4,9,16,25] All the operations described above - and many others - are provided by the `stdlib.q' script. It is instructive to take a look at this script and see how the operations are defined. In addition, the standard library provides yet another list generating function `listof' which implements so-called "list comprehensions". These allow you to describe list values in much the same way as sets are commonly specified in mathematics. For instance, you can generate a list of pairs `(I,J)' with `1<=J listof (I,J) (I in [1..5], J in [1..I-1]) [(2,1),(3,1),(3,2),(4,1),(4,2),(4,3),(5,1),(5,2),(5,3),(5,4)] The language provides syntactic sugar for list comprehensions so that the above example can also be written as follows: ==> [(I,J) : I in [1..5], J in [1..I-1]] [(2,1),(3,1),(3,2),(4,1),(4,2),(4,3),(5,1),(5,2),(5,3),(5,4)] The same kind of construct also works with tuples: ==> ((I,J) : I in [1..5], J in [1..I-1]) ((2,1),(3,1),(3,2),(4,1),(4,2),(4,3),(5,1),(5,2),(5,3),(5,4)) We also have a random number generator, which is implemented by the built-in `random' function. Here is how we can generate a list of 5 pseudo random 32 bit integers: ==> [random : I in [1..5]] [1960913167,1769592841,3443410988,2545648850,536988551] To get random floating point values in the range [0,1] instead, we simply divide the results of `random' by `0xffffffff': ==> map (/0xffffffff) _ [0.456560674928259,0.41201544027124,0.801731596887515,0.592705060400233, 0.125027389993199] Lists can be sorted using quicksort (this one comes from `sort.q'): ==> qsort (<) _ [0.125027389993199,0.41201544027124,0.456560674928259,0.592705060400233, 0.801731596887515] As already mentioned, the Q language also provides a "lazy" list data structure called "streams". Streams are like lists but can actually be infinite because their elements are only produced "on demand". The standard library includes a module `stream.q' which implements a lot of useful stream operations. Most list operations carry over to streams accordingly. For instance, if we want to create a geometric series like the one generated with `iter' above, but we do not know how many elements will be needed in advance, we can employ the stream generation function `iterate': ==> iterate (/3) 1 {1|iterate (/3) ((/3) 1)} The `{|}' stream constructor works in the same way as the `[|]' list constructor, but is "special" in that it does _not_ evaluate its arguments, i.e., the head and the tail of the stream. (Otherwise the call to `iterate' would never terminate.) To get all members of the series we can apply the `scanl' function: ==> def S = scanl (+) 0 _; S {0|scanl (+) (0+1) (iterate (/3) ((/3) 1))} Now we can extract any number of initial values of the series using the `take' operation, and convert the resulting stream to an ordinary list with the `list' function. For instance, if we are interested in the first five values of the series, we proceed as follows: ==> list (take 5 S) [0,1,1.33333333333333,1.44444444444444,1.48148148148148] Note that the stream `S' is really infinite; if we want, we can extract _any_ value in the series: ==> S!9999 1.5 Let's see how many iterations are actually required to reach the limit `1.5' with an error of at most `1e-15': ==> #takewhile (>1e-15) (map (1.5-) S) 32 This means that the sum of the first 31 series terms is needed to get an accuracy of 15 digits (which is the best that we can hope for with 64 bit floating point values). We can readily verify this using `iter': ==> sum (iter 31 (/3) 1) 1.5 The standard library also implements stream enumerations and comprehensions which work like the corresponding list and tuple constructs but are evaluated in a lazy fashion, and the interpreter provides syntactic sugar for these. In particular, the notation `{X..}' or `{X,Y..}' can be used to specify an infinite arithmetic sequence. For instance, here is another way to define the infinite geometric series from above: ==> def S = scanl (+) 0 {1/3^N : N in {0..}} ==> list (take 5 S) [0,1.0,1.33333333333333,1.44444444444444,1.48148148148148] It is worth noting here that streams are just one simple example of a lazy data structure, which happen to play an important role in many useful algorithms so that it makes sense to provide them as a builtin. The Q language makes it rather easy to define your own lazy data structures using so-called "special forms". In fact, special forms provide a uniform means to define both lazy data constructors and macro-like functions which take their arguments unevaluated, employing "call by name" parameter passing. This will be discussed in detail in *Note Special Forms::. Special forms are an integrated part of Q which adds a great amount of expressive power to the language. In particular, they allow specialized constructs such as a "short-circuit" conditional expressions to be defined in the standard library, just like "ordinary" functions, rather than having to provide them as builtins. One example is the conditional expression `if X then Y else Z' which is nothing but syntactic sugar for the standard library function `ifelse': ==> ifelse (5>0) "positive" "negative" "positive" ==> if 5>0 then "positive" else "negative" "positive" Another special form which is useful in many situations is `lambda' (as of Q 7.1, this one is actually provided as a builtin), which allows you to create little "anonymous" functions on the fly. The customary notation `\X.Y' is provided as a shorthand for `lambda X Y'. These constructs are also called "lambda abstractions", or simply "lambdas". For instance, the following interpreter command defines the well-known factorial function using a lambda abstraction and assigns the resulting function object to the variable `fac': ==> var fac = \N.if N>0 then N*fac (N-1) else 1 ==> fac \X1 . if X1>0 then X1*fac (X1-1) else 1 ==> map fac [1..10] [1,2,6,24,120,720,5040,40320,362880,3628800] Lambdas taking multiple arguments can be created like this: ==> \X Y.(1-X)*Y \X1 . \X2 . (1-X1)*X2 ==> _ 0.9 0.5 0.05 The lambda arguments can also be patterns to be matched against the actual parameters. For instance: ==> \(X,Y).(1-X)*Y \(X1,X2) . (1-X1)*X2 ==> _ (0.9,0.5) 0.05 Besides string and list processing functions and general utility functions of the kind sketched out above, the standard library also includes a collection of efficient "container" data structures which are useful in many applications. For instance, so-called "hashed dictionaries" (also known as "hashes" or "associative arrays") are implemented by the `hdict.q' script. The following expressions show how to create a dictionary object from a list, and how to access this dictionary using the index (`!'), `keys' and `vals' operations. (Note that here we have to load the `hdict.q' module explicitly, as it is not automatically included via the prelude. We could also import the `stdtypes.q' module instead, to get hold of all the container types.) ==> import hdict ==> def D = hdict [(foo,99),(bar,-1),(gnu,foo)]; D!gnu foo ==> keys D [foo,bar,gnu] ==> vals D [99,-1,foo] This completes our little tour through the standard library. To find out more, please refer to *Note The Standard Library::, or take a look at the scripts themselves.  File: qdoc.info, Node: Writing a Script, Next: Definitions, Prev: Using the Standard Library, Up: Getting Started 2.3 Writing a Script ==================== Now that we have taken the first big hurdle and are confident that we can actually get the Q interpreter to evaluate us some expressions, let us try to define a new function for use in the interpreter. That is, we are going to write our first own Q "program" or "script". In the Q language, there are actually two different ways to create new functions. As we have already seen in the previous sections, little "anonymous" functions are often computed on the fly by deriving them from existing functions using partial applications or lambda abstractions. The other way, which is often more convenient when the functions to be defined get more complicated, is the definition of named functions by the means of "equations", which works in a way similar to algebraic definitions in mathematics. We will discuss how to do this in the following. Having exited the Q interpreter, invoke your favourite text editor and enter the following simple script which implements the factorial function: fac N = N*fac(N-1) if N>0; = 1 otherwise; (Note that in order to define functions equationally, you really have to write a script file. For technical reasons there currently is no way to enter an equational definition directly in the interpreter.) Save the script in the file `fac.q' and then restart the interpreter as follows (here and in the following `$' denotes the command shell prompt): $ q fac.q You can also edit the script from within the interpreter (using `vi' or the editor named by the `EDITOR' environment variable) and then restart the interpreter with the new script, as follows: ==> edit fac.q ==> run fac.q In any case, now the interpreter should know about the definition of `fac' and we can use it like any of the built-in operations: ==> map fac [1..10] [1,2,6,24,120,720,5040,40320,362880,3628800] For instance, what is the number of 30-element subsets of a 100-element set? ==> fac 100 div (fac 30*fac 70) 29372339821610944823963760 As another example, let us write down Newton's algorithm for computing roots of a function. Type in the following script and save it to the file `newton.q': newton DX DY F = until (satis DY F) (improve DX F); satis DY F X = abs (F X) < DY; improve DX F X = X - F X / derive DX F X; derive DX F X = (F (X+DX) - F X) / DX; Restart the interpreter with the `newton.q' script and try the following. Note that the target function here is `\Y.Y^3-X' which becomes zero when `Y' equals the cube root of `X'. ==> var eps = .0001, cubrt = \X.newton eps eps (\Y.Y^3-X) X ==> cubrt 8 2.00000000344216 Well, this is not _that_ precise, but we can do better: ==> def eps = 1e-12 ==> cubrt 8 2.0  File: qdoc.info, Node: Definitions, Next: Runtime Errors, Prev: Writing a Script, Up: Getting Started 2.4 Definitions =============== As mentioned in the previous section, in the Q language function definitions take the form of equations which specify how a given expression pattern, the left-hand side of the equation, is transformed into a new expression, the right-hand side of the equation. In this section we elaborate on this some more to give you an idea about how these equational definitions work. Our explanations will necessarily be a bit sketchy at this point, but the concepts we briefly touch on in this section will be explained in much more detail later. The left-hand side of an equation may introduce "variables" (denoted by capitalized identifiers, as in Prolog) which play the role of formal parameters in conventional programming languages. For instance, the following equation defines a function `sqr' which squares its (numeric) argument: sqr X = X*X; An equation may also include a condition part, as in the definition of the factorial function from the previous section: fac N = N*fac(N-1) if N>0; = 1 otherwise; As indicated above, several equations for the same left-hand side can be factored to the left, omitting repetition of the left-hand side. Furthermore, the `otherwise' keyword may be used to denote the "default" alternative. The left-hand side of an equation may actually be an arbitrary compound expression pattern to be matched in the expression to be evaluated. For instance, as we have already seen, the expressions `[]' and `[X|Xs]' denote, respectively, the empty list and a list starting with head element `X' and continuing with a list `Xs' of remaining elements, just as in Prolog. We can use these patterns to define a function `add' which adds up the members of a list as follows: add [] = 0; add [X|Xs] = X+add Xs; Thus no explicit operations for "extracting" the members from a compound data object such as a list are required; the components are simply retrieved by "pattern matching". This works in variable definitions as well. For instance, in the interpreter you can bind variables to the head element and the tail of a list using a single definition as follows: ==> def [X|Xs] = [1,2,3]; X; Xs 1 [2,3] Pattern matching also works if the arguments take the form of function applications. For instance, the following definition implements an insertion operation on binary trees: insert nil Y = bin Y nil nil; insert (bin X T1 T2) Y = bin Y T1 T2 if X=Y; = bin X (insert T1 Y) T2 if X>Y; = bin X T1 (insert T2 Y) if X hd [] hd [] ==> sin (X+1) sin (X+1) You may be disconcerted by the fact that expressions like `hd []' and `sin (X+1)' actually denote legal values in the Q language. However, this is one of Q's key features as a term rewriting programming language, and it adds a great amount of flexibility to the language. Most importantly, it allows expressions, even expressions involving variables, to be evaluated in a symbolic fashion. For instance, you might add rules for algebraic identities and symbolic differentiation to your script and have the interpreter simplify expressions according to these rules. On the other hand, the Q language also allows you to refine the definition of any built-in or user-defined operation and thus it is a simple matter to add an "error rule" like the following to your script when needed: hd [] = error "hd: empty list"; (The `error' function is defined in the standard library script `error.q', see *Note Diagnostics and Error Messages::.) In order to implement more elaborate error handling, you can also use Q's built-in `throw' and `catch' functions to raise and respond to exceptions on certain error conditions. This is discussed in *Note Exception Handling::.  File: qdoc.info, Node: Runtime Errors, Prev: Definitions, Up: Getting Started 2.5 Runtime Errors ================== There are a few error conditions which may cause the Q interpreter to abort the evaluation with a runtime error message. A runtime error arises when the interpreter runs out of memory or stack space, when the user aborts an evaluation with the interrupt key (usually `Ctl-C'), and when the condition part of an equation does not evaluate to a truth value (`true' or `false'). (All these error conditions can also be handled by the executing script, see *Note Exception Handling::.) You can use the `break on' command to tell the interpreter that it should fire up the debugger when one of the latter two conditions arises. After `Ctl-C', you can then resume evaluation in the debugger, see *Note Debugging::. This is not possible if the interpreter stumbles across an invalid condition part; in this case the debugger is invoked merely to report the equation which caused the error, and to inspect the call chain of the offending rule. Hitting the carriage return key will return you to the interpreter's evaluation loop. For instance, reload the `fac.q' script from *Note Writing a Script::: fac N = N*fac(N-1) if N>0; = 1 otherwise; Now enable the debugger with the `break' command and type some "garbage" like `fac fac': ==> break on; fac fac ! Error in conditional 0> fac.q, line 1: fac fac ==> fac*fac (fac-1) if fac>0 (type ? for help) : What happened? Well, the debugger informs us that the error occurred when the interpreter tried to apply the first equation, fac N = N*fac(N-1) if N>0; to the expression `fac fac'. In fact this is no big surprise since we cannot expect the interpreter to know how to compare a symbol, `fac', with a number, `0', which it tried when evaluating the condition part of the above equation. So let us return to the interpreter's prompt by hitting the carriage return key: : ==> The interpreter now waits for us to type the next expression. (For a summary of debugger commands please refer to *Note Debugging::. You can also type `?' or `help' while in the debugger to obtain a list of available commands.)  File: qdoc.info, Node: Lexical Matters, Next: Scripts and Modules, Prev: Getting Started, Up: Top 3 Lexical Matters ***************** The vocabulary of the Q programming language consists of notations for identifiers, operators, integers, floating point numbers, character strings, comments, a few reserved words which may not be used as identifiers, and some special punctuation symbols which are used as delimiters. * Menu: * Whitespace and Comments:: * Identifiers and Reserved Words:: * Operator Symbols:: * Numbers:: * Strings:: * Unicode Support::  File: qdoc.info, Node: Whitespace and Comments, Next: Identifiers and Reserved Words, Prev: Lexical Matters, Up: Lexical Matters 3.1 Whitespace and Comments =========================== Whitespace (blanks, tabs, newlines, form feeds) serves as a delimiter between adjacent symbols, but is otherwise ignored. Comments are treated like whitespace: /* This is a comment ... */ Note that these comments cannot be nested. C++-style line-oriented comments are also supported: // C++-style comment ... Furthermore, lines beginning with the `#!' symbol denote a special type of comment which may be processed by the operating system's command shell and the Q programming tools. On UNIX systems, this (odd) feature allows you to execute Q scripts directly from the shell (by specifying the `q' program as a command language processor) and to include compiler and interpreter command line options in a script (see *Note Running Scripts from the Shell::).  File: qdoc.info, Node: Identifiers and Reserved Words, Next: Operator Symbols, Prev: Whitespace and Comments, Up: Lexical Matters 3.2 Identifiers and Reserved Words ================================== Identifiers are denoted by the usual sequences of letters (including `_') and digits, beginning with a letter. Upper- and lowercase is distinct. In the Q language, identifiers are used to denote "type", "function" and "variable" symbols. Type identifiers may start with either an uppercase or lowercase letter (the convention, however, is to use an initial uppercase letter). The case of the first letter matters, however, for function and variable symbols. As in Prolog, a capitalized identifier (such as `X', `Xmax' and `XMAX') indicates a variable symbol, whereas identifiers starting with a lowercase letter denote function symbols (unless they are declared as "free" variables, see below). Unlike in Prolog, the underscore `_' counts as a _lowercase_ letter, hence `_MAX' is a function symbol, not a variable. (The idea behind this is that it allows you to get a function symbol which appears to start with an uppercase letter by stropping it with an initial `_'.) However, as an exception to the general rule, the identifier `_' _does_ denote a variable symbol, the so-called "anonymous" variable. The same rules also apply to symbols created interactively in the interpreter. Variables actually come in two flavours: "bound" and "free" variables, i.e., variables which also occur on the left-hand side of an equation, and variables which only occur on the right-hand side and/or in the condition part of an equation. Identifiers may also be _declared_ as free variables; see *Note Declarations::. In this case, they may also start with a lowercase letter. Type, function and free variable identifiers may also be "qualified" with a "module" identifier prefix (cf. *Note Scripts and Modules::), to specifically denote a symbol of the given module. Such a qualified identifier takes the form `MODULE-ID::IDENTIFIER'; no whitespace or comments are allowed between the module name, `::' symbol and the function or variable identifier. Formally, the syntax of identifiers is described by the following grammatical rules: identifier : unqualified-identifier | qualified-identifier qualified-identifier : module-identifier '::' unqualified-identifier unqualified-identifier : variable-identifier | function-identifier | type-identifier module-identifier : letter {letter|digitsym} type-identifier : letter {letter|digitsym} variable-identifier : uppercase-letter {letter|digitsym} | '_' function-identifier : lowercase-letter {letter|digitsym} letter : uppercase-letter|lowercase-letter uppercase-letter : 'A'|...|'Z'|UPPERCASE UNICODE LETTER lowercase-letter : 'a'|...|'z'|'_'|LOWERCASE UNICODE LETTER digitsym : '0'|...|'9'|UNICODE DIGIT (Please refer to *Note Q Language Grammar::, for a description of the BNF grammar notation used throughout this document.) The following symbols are reserved words of the Q language and may not be used as identifiers: as const def else extern from if import include otherwise private public special then type undef var virtual where In addition, the following identifiers are predeclared as operator symbols (see *Note Operator Symbols::) and cannot be used as normal identifiers either: and div mod not or  File: qdoc.info, Node: Operator Symbols, Next: Numbers, Prev: Identifiers and Reserved Words, Up: Lexical Matters 3.3 Operator Symbols ==================== Operator symbols may either take the form of function identifiers or they may be sequences of punctuation characters (excluding `_' which serves as a letter in the Q language). Like function and free variable symbols, they may be qualified with a module identifier prefix: op : unary-op|binary-op unary-op : opsym binary-op : opsym|'and' 'then'|'or' 'else' opsym : unqualified-opsym | qualified-opsym qualified-opsym : module-identifier '::' unqualified-opsym unqualified-opsym : function-identifier | punctsym {punctsym} punctsym : UNICODE PUNCTUATION CHARACTER In either case, operator symbols _must_ be declared explicitly before they can be used, cf. *Note Declarations::. The declaration determines the precedence and "fixity" of the operator (i.e., whether it acts as a unary prefix or a binary infix operator; see *Note User-Defined Operators::, for more information on this). As already mentioned, the following identifiers are predefined as built-in operators: and div mod not or As indicated by the syntax rules, the logical connectives `and' and `or' can also be combined with the keywords `then' and `else' to form the "short-circuit" connectives `and then' and `or else'. The punctuation symbols predefined as built-in operators are the following: ` ' ~ & . || < > = <= >= <> == ++ + - * / ^ ! # $ Most of these can actually be redeclared by the programmer for his own purposes. However, there are a few symbols, called "soft delimiters", which play a special role in the syntax of the Q language (some of these are also used as operator symbols): ~ . .. : | = == - \ @ The soft delimiters may occur _inside_ user-defined operators, but if they are used as separate lexemes then they are treated like reserved words and thus they may not be declared as operator symbols (except for the purpose of making aliases, cf. *Note Declarations::). The same applies to the reserved keywords (see *Note Identifiers and Reserved Words::). Moreover, the following special symbols serve as "hard delimiters" in the Q language which always separate lexemes and thus may not occur inside operator symbols at all: " , ; :: ( ) [ ] { } The same applies to whitespace and other non-printable characters, as well as the comment delimiters (`//' and `/*' as well as initial `#!' on a script line). Symbols consisting of punctuation are generally parsed using the "longest possible lexeme" a.k.a. "maximal munch" rule. Here, the "longest possible lexeme" refers to the longest prefix of the input such that the sequence of punctuation characters either forms a _valid_ (i.e., declared) operator symbol, or one of the reserved and special delimiter symbols listed above. Thus, e.g., `..#' will usually be parsed as `.. #', i.e., a reserved `..' symbol followed by a `#' operator. This holds unless the entire sequence `..#' has already been declared as an operator in the scope where it is used. The only exception to the "maximal munch" rule occurs inside declarations where a new operator symbol is being introduced. In this case the symbol extends up to the next hard delimiter symbol (usually `)' or `;') or whitespace/non-printable character in the input. For instance, the following Q code snippet declares a new binary operator symbol `+~%' (please refer to *Note Declarations::, for an explanation of the declaration syntax): public (+~%) X Y; Another special case arises with conglomerates of three or more consecutive `:' symbols. In this case the initial `::' is always treated as the designation of a qualified (operator) symbol. (In the unlikely case that you ever need to specify a "guarded variable" of the form `X : ::Type' with a qualified type in the built-in namespace, the space between the `:' token and the following `::' qualification designator is mandatory.)  File: qdoc.info, Node: Numbers, Next: Strings, Prev: Operator Symbols, Up: Lexical Matters 3.4 Numbers =========== Signed decimal numeric constants are denoted by sequences of decimal digits (only the standard digits 0..9 in the ASCII character set may be used here) and may contain a decimal point and/or a scaling factor. Integers can also be denoted in octal or hexadecimal, using the same syntax as in C: number : ['-'] unsigned-number unsigned-number : '0' octdigitseq | '0x' hexdigitseq | '0X' hexdigitseq | digitseq ['.' [digitseq]] [scalefact] | [digitseq] '.' digitseq [scalefact] digitseq : digit {digit} octdigitseq : octdigit {octdigit} hexdigitseq : hexdigit {hexdigit} scalefact : 'E' ['-'] digitseq | 'e' ['-'] digitseq digit : '0'|...|'9' octdigit : '0'|...|'7' hexdigit : '0'|...|'9'|'a'|...|'f'|'A'|...|'F' Simple digit sequences without decimal point and scaling factor are treated as integers; if the sequence starts with `0' or `0x'/`0X' then it denotes an integer in octal or hexadecimal base, respectively. Other numbers denote (decimal) floating point values. If a decimal point is present, it must be preceded or followed by at least one digit. Both the scaling factor and the number itself may be prefixed with a minus sign. (Syntactically, the minus sign in front of a number is interpreted as unary minus, cf. *Note Expressions::. However, if unary minus occurs in front of a number, it is interpreted as a part of the number and is taken to denote a negative value. See the remarks concerning unary minus in *Note Expressions::.) Some examples: 0 -187326 0.0 -.05 3.1415e3 -1E-10 0177 0xaf -0XFFFF  File: qdoc.info, Node: Strings, Next: Unicode Support, Prev: Numbers, Up: Lexical Matters 3.5 Strings =========== String constants are written as character sequences enclosed in double quotes: string : '"' {char} '"' char : ANY CHARACTER BUT NEWLINE AND `"' To include newlines, double quotes and other (non-printable) characters in a string, the following escape sequences may be used: \n newline \r carriage return \t tab \b backspace \f form feed \" double quote \\ backslash Furthermore, a character may also be denoted in the form `\N', where `N' is the character number in decimal, hexadecimal or octal (using the same syntax as for unsigned integer values). Note that the character number may consist of an arbitrary number of digits; the resulting value will be taken modulo the size of the local character set, which is 256 in the case of ASCII-only systems and 0x110000 if the interpreter has been built with Unicode support (see *Note Unicode Support::). Optionally, the character number may also be enclosed in parentheses; this makes it possible to specify a string in which a character escape is immediately followed by another character which happens to be a valid digit, as in `"\(123)4"'. As of version 7.11 and later, the interpreter also supports symbolic character escapes of the form `\&NAME;', where NAME is any of the XML single character entity names specified in the "XML Entity definitions for Characters", see `http://www.w3.org/TR/xml-entity-names/'. Note that, at the time of this writing, this is still a W3C working draft, so the supported entity names may be subject to change until the final specification comes out; the currently supported entities are described in the draft from 14 December 2007, see `http://www.w3.org/TR/2007/WD-xml-entity-names-20071214/'. Also note that multi-character entities are _not_ supported in this implementation. A string may also be continued across lines by putting the `\' character immediately before the end of the line, which causes the following newline character to be ignored. As of Q 7.0, it is a syntax error if a character escape is not recognized as either a numeric escape or one of the special escapes listed above. Some examples: "" empty string "A" single character string "\27" ASCII escape character (ASCII 27) "\033" same in octal "\0x1b" same in hexadecimal "\(0x1b)c" ASCII escape followed by a literal `c' character "Gr\äf" German umlaut, using XML entity escape "a string" multiple character string "a \"quoted\" string" include double quotes "a line\n" include newline "a very \ continue across line end long line\n"  File: qdoc.info, Node: Unicode Support, Prev: Strings, Up: Lexical Matters 3.6 Unicode Support =================== Since version 7.0 the Q interpreter has full Unicode support. This means that identifiers may actually contain arbitrary alphanumeric letter and digit symbols from the entire Unicode character set. (All non-uppercase alphabetical letters are considered as lowercase letters which may begin a function identifier.) Likewise, operator symbols may consist of arbitrary punctuation characters in the Unicode character set. (We refrain from giving any concrete examples here, to keep this document 7-bit clean.) Moreover, all string values are internally represented using the UTF-8 encoding, which allows you to represent all characters in the Unicode character set, while retaining backward compatibility with 7 bit ASCII. String literals in the source script will be translated from the system encoding to UTF-8 automatically and transparently, and the `\N' notation may specify any valid Unicode point. For this to work, the interpreter must have been built with Unicode support enabled (which is the default).  File: qdoc.info, Node: Scripts and Modules, Next: Declarations, Prev: Lexical Matters, Up: Top 4 Scripts and Modules ********************* The basic compilation unit in the Q language is the "script", which is simply a (possibly empty) sequence of declarations and definitions in a single source file: script : {declaration|definition} In order to make a given set of definitions available for use in the interpreter, you can just collect the relevant declarations, variable definitions and equations in a script which you submit to the interpreter for execution. The interpreter in turn invokes the compiler to translate the script into a "bytecode" file which is loaded by the interpreter (see *Note Using Q::). A script may also be empty, in which case it does not provide any definitions at all. If you put all your definitions into a single script, that's all there is to it. However, you will often want to break down a larger script into a collection of smaller units which can be managed separately and which are linked together by the compiler into a single bytecode file. To these ends the Q language allows you to put your definitions into several separate script files, which are also called "modules". To gain access to the function, variable and type symbols provided by another module, you must then use an "import" or "include" declaration, using the following syntax: declaration : unqualified-import | qualified-import unqualified-import : 'import' module-spec {',' module-spec} ';' | 'include' module-spec {',' module-spec} ';' qualified-import : 'from' module-spec 'import' [symbol-specs] ';' | 'from' module-spec 'include' [symbol-specs] ';' module-spec : module-name ['as' unqualified-identifier] module-name : unqualified-identifier | string symbol-specs : symbol-spec {',' symbol-spec} symbol-spec : unqualified-identifier ['as' unqualified-identifier] | unqualified-opsym ['as' unqualified-opsym] As of version 7.8, Q supports both "unqualified" and "qualified" import clauses. The former allows you to quickly import an entire collection of modules, while the latter gives you precise control over which symbols are to be imported from which modules. We describe each of these in turn. * Menu: * Unqualified Imports:: * Qualified Imports:: * Implicit Imports and the Prelude:: * The Global Namespace::  File: qdoc.info, Node: Unqualified Imports, Next: Qualified Imports, Up: Scripts and Modules 4.1 Unqualified Imports ======================= Easiest first, let as take a look at unqualified imports. The following declaration imports two modules `foo' and `bar': import foo, bar; If you put such declarations into your "main script" and then run the script with the interpreter, the imported modules will be loaded as well. To determine which functions, variables and types are actually available to be imported into another script, the Q language allows you to declare symbols either as "public" or "private", see *Note Declarations::. Outside a module, only the public symbols are accessible when the module is imported in another module. Normally import is not "transitive", i.e., importing a module `foo' does _not_ automatically give you access to the symbols imported by `foo', but only to the public symbols declared in `foo' itself. To work around this, instead of `import' you can also use an `include' declaration which causes all imports and includes of the included module to be "reexported". For instance, let us consider the case that module `foo' includes module `bar': include bar; Then by importing module `foo' you also gain access to the public symbols of module `bar' (and, recursively, to the modules included by `bar', etc.). This provides a means to group different modules together in one larger "umbrella" module. For instance, the standard prelude script (cf. *Note The Standard Library::) simply includes most of the other standard library modules. In the Q language, each module has its own separate namespace. This means that two different modules, say `foo1' and `foo2', may both provide their own public function symbol named `foo'. If both `foo1' and `foo2' are imported in the same script, you can distinguish between the two by using a "qualified identifier", using the module name as a prefix, e.g., `foo1::foo' or `foo2::foo'. (Writing simply `foo' in this case will produce an error message because the reference is ambiguous.) Import and include declarations can occur anywhere in a script (and will be active from this point on up to the end of the script file), but it is common (and recommended) practice to put them near the beginning of the script. As in most other programming languages, module dependencies must always be acyclic. If the compiler finds a cyclic chain of `import' and `include' declarations, such as a module importing itself or a module `foo' importing a module `bar' which in turn imports `foo' again, it produces an error message. Another caveat: When using unqualified import clauses, it is much too easy to accidentally "reuse" an imported symbol because of a missing local symbol declaration. For instance, if you import a module `foo' which happens to export a symbol `bar', and then you later define a function `bar' in your script, your definition will use the imported `bar' symbol. This is perfectly legal in Q, and may be intended, but if it's not then you have to explicitly declare a new local `bar' symbol after the import clause, cf. *Note Declarations::. If you forget this, you will errorneously redefine the existing `bar' function instead. One way to avoid this is to always use qualified imports, as described below. Another possibility is to run the interpreter with the `--pedantic' option, in which case it will warn you about symbols from unqualified imports which are referred to with unqualified identifiers, see *Note Running Compiler and Interpreter::, for details. As indicated in the syntax rules, the names of the modules to be imported can either be specified using an unqualified identifier, or a string denoting a full or relative pathname, e.g.: import "/home/ag/q/stuff.q"; In this case, you can also omit the `.q' suffix, it will be supplied automatically when necessary. Moreover, the module identifier is automatically the basename of the script filename (i.e., `stuff' in the above example); therefore, it is a good idea to choose basenames which are legal Q identifiers so that you can use them in qualified symbols. If no absolute path is specified, the interpreter locates script files using its "search path", which usually contains the current directory and other system-dependent or user-defined locations where "library" scripts are kept, see *Note Using Q::, for details. The `as' keyword can be used to explicitly specify an unambiguous module name. This is useful, in particular, if the basename of the module is _not_ a valid identifier, and for the purpose of resolving name clashes. For instance: import "my/stdlib.q" as mystdlib; Simply importing `"my/stdlib.q"' under its own name in this case would produce an error message, because the name of the module collides with the standard library module of the same name, and the compiler enforces that all module names are unique.  File: qdoc.info, Node: Qualified Imports, Next: Implicit Imports and the Prelude, Prev: Unqualified Imports, Up: Scripts and Modules 4.2 Qualified Imports ===================== A qualified import takes the form: from foo import foo, BAR; This specifically imports the symbols `foo' and `BAR' from module `foo', and nothing else. What this actually does is to "redeclare" the imported symbols in the namespace of the importing module, as explained in *Note Cross-Checking of Declarations and Aliases::, so that they look like private symbols of the importing module. Thus you can refer to them, e.g., simply as `foo' or with the qualified name `gnats::foo', where `gnats' is the name of the _importing_ module. Note that `foo::foo' will _not_ work in this case, because a qualified import does _not_ bring the module itself into scope. (However, as pointed out below you can use both a qualified _and_ an unqualified import in concert to make both `gnats::foo' and `foo::foo' work.) A note on terminology: In Q parlance, the terms "qualified import" and "unqualified import" apply to the import clauses themselves; a qualified import clause is called "qualified" simply because it restricts the set of imported symbols. Unqualified and qualified _identifiers_ can be used with both styles of import clauses alike. As with unqualified import, there is a second kind of qualified import clause which also reexports the imported symbols, by making them public members of the importing module: from foo include foo, BAR; For convenience, you can also write: from module import; or from module include; without listing any symbols to just import or include _all_ symbols of the given module qualified. (It goes without saying that this should be used with care.) This is roughly equivalent to `import module' or `include module', respectively, but the imported symbols are made available as members of the namespace of the client, not the imported module. If this sounds a bit confusing, here is an example which should clarify the differences between unqualified and qualified imports. Let's assume the following three modules `A', `B' and `C': /* A: */ import B; /* B: */ include C; /* C: */ public foo; Using this setup, the symbol `foo' is available as either just `foo' or as `C::foo' in all of `A', `B' and `C'. Now suppose we change module `B' to use a qualified `include' instead: /* B: */ from C include foo; /* or just: from C include; */ `C''s namespace hasn't changed, of course, but in both `A' and `B' the symbol `foo' is now available as `foo' or `B::foo', but _not_ as `C::foo' anymore. However, you can also combine unqualified and qualified imports, like this: /* B: */ include C; from C include foo; Now the symbol `foo' is available as either `foo', `B::foo' _or_ `C::foo' in `B' and `C' (and the compiler will recognize it as the same symbol no matter which notation you use). Finally, note that since symbols imported using a qualified clause become members of the importing namespace, they must not collide with other symbols declared either locally or in another qualified import clause. Thus, if you have two modules `foo1' and `foo2' which both export their own (local) symbol `foo', the following will provoke an error message from the compiler because of the clash between the two different `foo' symbols: from foo1 import foo; from foo2 import foo; In such cases you must either use unqualified import, or employ an `as' clause to rename the symbols while importing them. E.g.: from foo1 import foo as foo1; from foo2 import foo as foo2; Note that this is only a problem if the imported symbols are really distinct. If the symbols are just different "aliases" of the same symbol defined elsewhere, then you can just import them under the same name into the same scope without any problems.  File: qdoc.info, Node: Implicit Imports and the Prelude, Next: The Global Namespace, Prev: Qualified Imports, Up: Scripts and Modules 4.3 Implicit Imports and the Prelude ==================================== Actually, there is yet another kind of import, namely the "implicit import" of the `prelude.q' script at the beginning of each script, which in turn includes most standard library modules (see *Note The Standard Library::) and makes them readily available in your program, without having to use an explicit import declaration. When looking up an unqualified symbol in a given script, the compiler first searches for a symbol defined in that script, then for a symbol in an imported or included module, then for a symbol defined by the prelude and its includes, and finally for a built-in symbol. You can instruct the compiler to inhibit the default import of the prelude, by using the `--no-prelude' option, see *Note Using Q::. Moreover, you can also override the standard prelude with your own, if it occurs on the search path _before_ the standard prelude, see *Note Setting up your Environment::.  File: qdoc.info, Node: The Global Namespace, Prev: Implicit Imports and the Prelude, Up: Scripts and Modules 4.4 The Global Namespace ======================== The namespace available in the interpreter is always that of the main script, i.e., the script given on the command line. This namespace also includes the "built-in namespace" which contains all the built-in symbols, as well as the implicit imports, i.e., the prelude and all its includes (unless the `--no-prelude' option was used). The built-in namespace can be accessed explicitly by using an empty module qualifier, as in `::sin'. Qualified prelude symbols use the corresponding module name as the qualifier (e.g., `stdlib::cat'), just as with explicit imports. The interpreter also allows you to dynamically import additional modules in the global scope using the `import' command, see *Note Command Language::. Moreover, to facilitate testing and debugging, in the interpreter it is possible to gain access to _all_ public and private symbols of the program (also in modules not directly imported in the main script) using qualified identifiers.  File: qdoc.info, Node: Declarations, Next: Expressions, Prev: Scripts and Modules, Up: Top 5 Declarations ************** The Q language allows you to declare function and (free) variable symbols explicitly, by means of the syntactic constructs discussed in this chapter. Symbol declarations are mostly optional; if you introduce a new symbol without declaring it, the compiler declares it for you. However, you will sometimes wish to ensure that a new symbol is created to override a symbol of the same name from an imported module, or you want to attach special attributes to a symbol, and then you have to use an explicit declaration. Explicit declarations are also needed to introduce new operator and type symbols. Moreover, while implicit declarations are often convenient when you want to get something done quickly, for larger projects you might prefer to play it safe and ensure that each function symbol actually has a proper explicit declaration. In such cases you can run the compiler with the `--pedantic' option (see *Note Running Compiler and Interpreter::) to warn you about undeclared occurrences of an identifier. * Menu: * Declaration Syntax:: * Operator Declarations:: * Cross-Checking of Declarations and Aliases:: * Type Declarations::  File: qdoc.info, Node: Declaration Syntax, Next: Operator Declarations, Prev: Declarations, Up: Declarations 5.1 Declaration Syntax ====================== Syntactically, symbol declarations take the following form: declaration : prefix headers ';' | [scope] 'type' unqualified-identifier [':' identifier] ['=' sections] ';' | [scope] 'extern' 'type' unqualified-identifier [':' identifier] ['=' sections] ';' | [scope] 'type' qualified-identifier ['as' unqualified-identifier] ';' | [scope] 'type' unqualified-identifier '==' identifier ';' prefix : scope | [scope] modifier {modifier} scope : 'private'|'public' modifier : 'const'|'special'|'extern'|'var'|'virtual' headers : header {',' header} header : unqualified-identifier '=' expression | unqualified-identifier {['~'] variable-identifier} | qualified-identifier {['~'] variable-identifier} ['as' unqualified-identifier] | '(' unqualified-opsym ')' {['~'] variable-identifier} ['@' precedence] | '(' qualified-opsym ')' {['~'] variable-identifier} ['@' precedence] ['as' unqualified-opsym] precedence : unsigned-number | '(' operator ')' sections : section {'|' section} section : [prefix] headers For instance, the following are all valid symbol declarations: public foo; private extern bar X; public special lambda X Y; special cond::ifelse ~P X Y as myifelse; public (--) Xs Ys; private (::++) Xs Ys as concat; const red, green, blue; var FOO, BAR; var const e = exp 1; type Day = const sun, mon, tue, wed, thu, fri, sat; public type BinTree = virtual bintree Xs | private const nil, bin X T1 T2; The keywords `private' and `public' specify the "scope" of a symbol. "Public" symbols are accessible outside a script, and can be imported by other modules (see *Note Scripts and Modules::). If the keywords `private' and `public' are omitted, the scope defaults to private. The `special' keyword serves to declare "special forms", which are described in *Note Special Forms::. Function symbols declared with `const' introduce "constant" or "constructor" symbols; the compiler enforces that expressions created with such symbols are not redefined in an equation (cf. *Note Equations::). The built-in constants `true', `false', `[]' and `()' are already predeclared as `const'. The `virtual' keyword serves to declare "virtual constructors" which can be used to define concrete representations (so-called "views") of abstract data types, see *Note Views::. Non-`const' function symbols can also be declared as `extern', meaning that the corresponding function is actually implemented by a corresponding "external" module, see *Note C Language Interface::. The `var' keyword allows you to declare "free" variable symbols, which can be assigned a value by means of a "variable definition", see *Note Free Variables::. If you use such a declaration, the variable symbol may also start with a lowercase letter (if it has not already been declared as a function symbol); note that without such a declaration, an identifier starting with a lowercase letter will implicitly be declared as a function symbol. (The idea behind this is that you can make a variable look like a function symbol, if the variable is actually supposed to be used for function values.) Variable symbols can also be declared as `const'; in this case the variable can only be assigned to _once_ and always refers to the same value once it has been defined. The built-in variables `INPUT', `OUTPUT', `ERROR' and `ARGS' are predeclared as `const', see *Note Command Language::. A variable declaration may also contain an "initializer" of the form `= X', where `X' is an expression. This has the effect of declaring and defining the variable in a single step. As indicated, the scope-modifier prefix of a declaration is followed by a comma-separated list of "headers". The first identifier in each header states the symbol to be declared. In function symbol declarations the function identifier may be followed by a list of variable identifiers which are used to denote the arguments of the declared function symbol. The variables are effectively treated as comments; only their number (called the "arity" of a function symbol) is recorded to check the consistency of different declarations of the same symbol, and to specify the number of "special" arguments in a `special' declaration. In `special' declarations variables may also be prefixed with `~' to declare them as "non-special" arguments; see *Note Special Forms::. The first declaration of an unqualified identifier or operator symbol in a module always introduces a new symbol in the module's namespace. By default (if no explicit declaration precedes the first use of a new, unqualified symbol), the compiler automatically declares it as a `private' nullary function or variable symbol, depending on the case of the initial letter of the identifier (as already mentioned in *Note Lexical Matters::, capitalized identifiers are interpreted as variable symbols, others as function symbols).  File: qdoc.info, Node: Operator Declarations, Next: Cross-Checking of Declarations and Aliases, Prev: Declaration Syntax, Up: Declarations 5.2 Operator Declarations ========================= As of Q 6.2, it is also possible to declare new prefix and infix "operator symbols". Such declarations are never optional; operator symbols must always be declared before they are used. Syntactically, they take exactly the same form as a function symbol declaration, but the operator symbol is given inside parentheses. As discussed in *Note Lexical Matters::, the operator symbol may either be a function identifier or a sequence of punctuation characters. Examples: public (myop) X Y; public (--) X Y; You can also specify a precedence level, like so: public (myop) X Y @ 2; // use explicit precedence level ... public (myop) X Y @ (<); // ... or precedence level of given operator If no precedence is given, it defaults to 2, which is the precedence of the relational operators. This is described in more detail in *Note User-Defined Operators::.  File: qdoc.info, Node: Cross-Checking of Declarations and Aliases, Next: Type Declarations, Prev: Operator Declarations, Up: Declarations 5.3 Cross-Checking of Declarations and Aliases ============================================== The Q language allows multiple declarations of the same symbol, and the compiler will verify the consistency of all declarations of a given symbol. That is, if a symbol is declared with different scope (`private' or `public'), attributes (like `var', `const' or `special') or different number of arguments, or if non-special arguments of a special form are declared differently, then the compiler will issue an error message. Note, however, that the compiler will _never_ cross-check the number of parameters in a function symbol declaration with the actual number of arguments to which the function is applied when it is used in an equation or a variable definition. This is because in Q it is perfectly legal to have a function symbol applied to a varying number of arguments for the purpose of constructing partial applications. Also note that if a local function symbol is first used without an explicit declaration then it will be implicitly declared to have a zero argument count. If the symbol is then later declared with a nonzero number of arguments or other non-default attributes then the compiler will print an error message. As indicated by the syntactic rules, it is also possible to "redeclare" a _qualified_ symbol. This requires that the target module either is the current module or has already been imported in the current scope using an unqualified import clause, and causes the compiler to both cross-check the declaration with the declaration in the imported module and redeclare the symbol within the current namespace. Such a redeclaration serves several different purposes. First and foremost, it allows you to ensure that an imported symbol was actually declared with the given attributes. Second, it lets you resolve name clashes by redeclaring the symbol in the current scope where it overrides unqualified imports of other modules; if you want, you can also import the symbol under a new name using an `as' clause (the new symbol is then also referred to as an "alias" of the original one). In these two cases the symbol will normally be redeclared as a private symbol. Third, by redeclaring a symbol as `public', you cause the symbol to be "reexported" by the current module. Examples: import gnats; private gnats::foo X Y; // cross-check declaration of gnats::foo public gnats::foo X Y as bar; // reexport gnats::foo as bar Of course, the same also works with operator symbols. E.g., here is how you declare a new operator symbol `concat' which behaves exactly like the built-in `++' operator: public (::++) X Y as concat; There is yet another usage of an imported symbol redeclaration, namely the `extern' redeclaration. This is only possible with function symbols and if the redeclared symbol is not already declared `extern' by another module. It instructs the compiler that this module provides an external definition of the symbol which will override equational definitions in other modules, see *Note C Language Interface::, for details. Note that, as of Q 7.8, most of the uses of qualified symbol redeclarations are now covered in a more convenient fashion by qualified imports, see *Note Qualified Imports::. Thus qualified symbol redeclarations will now mostly be used for cross-checking declarations and for external overrides.  File: qdoc.info, Node: Type Declarations, Prev: Cross-Checking of Declarations and Aliases, Up: Declarations 5.4 Type Declarations ===================== A "type declaration" introduces a type identifier, an optional supertype for the type, and an optional list of "constructor" symbols for the type. It associates the function symbols in the list with the given type. The symbol list is a `|'-delimited list of individual "sections". Each section takes the form of an ordinary symbol declaration, consisting of scope/modifier prefix and a comma-delimited list of headers. The prefix is optional; by default, a symbol has the same scope as the type it belongs to. To change scope and attributes of a symbol on the list, you use an appropriate scope/modifier prefix. For instance, you can declare a public `BinTree' type with private constructors `nil' and `bin' as follows: public type BinTree = private const nil, bin X T1 T2; Each constructor symbol may only be declared once; otherwise the compiler will issue an error message. Hence the compiler enforces that a constructor symbol only belongs to a single type, and that the number of arguments of that symbol is determined uniquely. A constructor symbol may also be declared as `virtual', which indicates that it may be used in concrete representations of abstract data types, so-called "views". Such symbols are usually non-`const' symbols implementing a public construction function which delivers values of the type they belong to, see *Note Views::, for details. For instance, to add a virtual constructor `bintree' to the above `BinTree' type, you might declare the type as follows: public type BinTree = virtual bintree Xs | private const nil, bin X T1 T2; Type identifiers may begin with either an upper- or lowercase letter (the convention, however, is to use capitalized identifiers). There may only be a single declaration for each type. As of Q 7.8, type identifiers must also be distinct from function or variable symbols declared in the same namespace, as both type and function/variable identifiers may now occur in the same context (namely, a qualified import clause). Types are used on the left-hand side of equations to restrict the set of expressions which can be matched by a variable; see *Note Type Guards::, for details. The Q language has a number of predefined type symbols which distinguish the corresponding types of objects built into the language: `Bool', `Int', `Float', `Real' (which is the supertype of both `Int' and `Float'), `Num' (the supertype of `Real'), `String', `Char' (the subtype of `String' denoting the single-character strings), `File', `List', `Stream', `Tuple' and `Function'. (Besides this, there are also two built-in types for exceptions, see *Note Exception Handling::.) Like function symbols, types imported from other modules can also be redeclared (possibly under a new name), and reexported, e.g.: public type array::Array as MyArray; In this case, no constructor symbols or supertype are specified. (As of Q 7.8, it is also possible to use a qualified import clause to achieve this, see *Note Qualified Imports::. Also note that if a type is imported using a qualified import clause, then its constructor symbols will _not_ be imported automatically; you will have to explicitly list them in the import clause if you need them.) As of Q 7.1, there is an alternative, more customary syntax for introducing type aliases which looks as follows: type Integer == Int; Note that here the alias to be defined is specified first, and the existing type to be aliased can be specified without a module qualifier. This works exactly like an ordinary alias declaration, but resembles more closely the syntax commonly used for defining type synonyms in other languages like Pascal and Haskell. Types can also be declared as `extern', to indicate that they are realized in a corresponding C module, as described in *Note C Language Interface::. External types may have a supertype, but only virtual constructors. For instance: extern type Bar = virtual bar X; In difference to function symbols, an existing type imported from another module cannot be redeclared as `extern'. Therefore an external definition must always be given by the module which originally declares the type.  File: qdoc.info, Node: Expressions, Next: Equations and Expression Evaluation, Prev: Declarations, Up: Top 6 Expressions ************* The notion of expressions is an intrinsic part of most, if not all, programming languages. However, in most programming languages expressions are merely treated as descriptions of computations to be performed in order to obtain the value of an expression, which usually belongs to some basic type provided by the language. In contrast, in the Q language expressions are objects in their own right which, as we shall see, are manipulated according to computational rules specified by the programmer. As usual, expressions are built from certain kinds of basic objects. In the Q language, these are integers, floating point numbers, strings, function and variable symbols. These objects are combined into larger, compound expressions by means of applications, tuples, lists and streams. For convenience, the Q language also has prefix, infix and mixfix operator symbols for the most common operations; these are actually treated as "syntactic sugar" for applications. Syntactic sugar is also provided for list, tuple and stream enumerations and comprehensions. The syntax of all these entities is described by the following grammatical rules: expression : identifier | 'var' unqualified-identifier | variable-identifier ':' identifier | number | string | expression expression | unary-op expression | expression binary-op expression | 'if' expression 'then' expression ['else' expression] | '\' expression { expression } '.' expression | '(' [element-list|enumeration|comprehension] ')' | '[' [element-list|enumeration|comprehension] ']' | '{' [element-list|enumeration|comprehension] '}' | '(' op ')' | '(' expression binary-op ')' | '(' binary-op expression ')' element-list : expression-list [',' | ';' | '|' expression] enumeration : expression-list '..' [expression] comprehension : expression ':' expression-list expression-list : expression {',' expression} | expression {';' expression} * Menu: * Constants and Variables:: * Applications:: * Lists Streams and Tuples:: * Built-In Operators:: * User-Defined Operators::  File: qdoc.info, Node: Constants and Variables, Next: Applications, Prev: Expressions, Up: Expressions 6.1 Constants and Variables =========================== The nonterminals `identifier', `op', `unary-op', `binary-op', `number' and `string' occuring in the syntax rules above refer to the lexical entities introduced in *Note Lexical Matters::. Note that the Q language actually distinguishes two different types of identifiers (function and variable identifiers), two different kinds of operator symbols (unary and binary) and two different types of numeric quantities (integers and floating point numbers). Integers are implemented as "bignums" using the GNU multiprecision package (GMP); thus the size of an integer value is only limited by available memory. Floating point values are implemented using 64 bit (i.e., double precision) floating point numbers; on most machines, these should provide nonzero absolute values ranging from 1.7E-308 to 1.7E308 and a precision of 15 decimal digits. String constants are character sequences internally represented as character arrays permitting constant-time access to the individual characters of a string. There is no a priori length restriction for string constants. In the current implementation, the null character `\0' is reserved as a string terminator and must not be used as an ordinary string character. Furthermore, the Q language provides five other built-in constants which denote the truth values (`true' and `false'), and the empty list, stream and tuple (`[]', `{}' and `()', to be discussed in *Note Lists Streams and Tuples::). A variable symbol can be "bound" or "free", depending on the context in which it occurs. We say that a variable is "bound" in an equation if it occurs on the left-hand side of the equation. Otherwise, the variable is a "free" variable. In this case, the variable may denote a value (introduced with `def'), or simply stands for itself. In any case, a variable is a legal atomic expression which may stand wherever a constant is allowed. On the _left-hand side_ of equations, it is possible to declare that a variable is of a given "type" by using the notation: `VARIABLE:TYPE'. This requires that you have previously declared TYPE as a type identifier, see *Note Declarations::. When a type identifier is specified with a variable, the variable will only match values belonging to the given type, cf. *Note Type Guards::. On the _right-hand side_ of equations, an unqualified identifier can also be prefixed with the keyword `var' to indicate that the identifier is to be treated as a free variable symbol. Such "inline" variable declarations can be used to escape free variables (if the same identifier is also introduced as a bound variable on the left-hand side of the equation) and to declare free variable symbols on the fly, see *Note Local Variables::, for details.  File: qdoc.info, Node: Applications, Next: Lists Streams and Tuples, Prev: Constants and Variables, Up: Expressions 6.2 Applications ================ Application is probably the most important single construct in the Q language. It allows you to apply one object (the "function") to another one (the "argument"). This construct is used to denote "function calls" such as `sqrt 2' as well as "constructor terms" such as `bin X T1 T2' which encode tree-like data structures. Also, as we will see in *Note Built-In Operators::, operators like `+', `-', etc. are merely syntactic sugar for applications. As in other contemporary functional languages, application is a binary operation written simply as juxtaposition: `X Y' denotes the application of `X' to `Y', both of which may be arbitrary expressions themselves. Application associates to the left; the construct `X Y1 ... Yn' is equivalent to `(...((X Y1) Y2) ...) Yn', and denotes the application of `X' to n arguments `Y1', ..., `Yn'. This style of writing function applications is commonly referred to as "currying", after the American logician H.B. Curry. We will have to say more about this shortly. Here are some valid examples of applications: sqrt 2 sqrt (Y+1) foo X (bar (Y-1)) Z Note that since application is left-associative, nested applications in arguments must be set off with parentheses. For instance, `foo (bar X)' applies `foo' to `bar X', whereas `foo bar X' applies `foo' to two arguments `bar' and `X'. Since currying is so ubiquitous in functional programming, you should be well familiar with it, so let us briefly explain what it is, and what it is good for. Functions of multiple arguments can generally be defined in two different ways: * As a function taking a single, structured argument (usually a tuple). This is the traditional, "uncurried" method one commonly encounters in mathematics. For instance, we might define the function `max', which computes the maximum of two values, as follows: max (X,Y) = X if X>Y; = Y otherwise; * Recursively, as a function which, when given the first argument, yields another function of the remaining arguments. Such functions are called "curried". For instance, here is a curried definition for the `max' function: max X Y = X if X>Y; = Y otherwise; Here `max X Y' is to be read as `(max X) Y', meaning that by applying `max' to the first argument `X', we obtain another function `max X', which, when applied to the second argument `Y', yields the maximum of `X' and `Y'. (Incidentally, this curried form is also the way that the `max' function is actually defined in the standard library. In fact, most built-in and standard library functions use the curried form as it is more flexible.) The Q language supports both kinds of notations. Choosing between the two is more than just a matter of taste. Besides saving parentheses, curried functions provide a cheap way to derive new functions through "partial applications" of a multi-argument function. For instance, given the curried definition of `max' from above, `max 0' can be used to denote the function computing the "nonnegative part" of its argument (which is the argument itself if it is nonnegative, and zero otherwise). This does not work with the uncurried definition since that definition requires us to specify both arguments of `max' in one chunk; instead, we would have to start defining the derived function from scratch. Uncurried definitions also have their merits. In particular, they allow us to define "variadic" functions, i.e., functions which can handle a varying number of arguments. For instance, the following definition enables us to apply the `max' function to both pairs and triples: max (X,Y) = X if X>=Y; = Y otherwise; max (X,Y,Z) = max (X,max (Y,Z)) In fact, in the Q language it is possible to define generic functions which apply to any number of arguments specified as a tuple. For instance, the following version of `max' handles tuples of any size at least 2: max (X,Y) = X if X>=Y; = Y otherwise; max (X,Y|Zs) = max (X,max (Y|Zs)); (In the above example, the notation `(...|Zs)' is used to denote a tuple with "tail" `Zs'. This is explained in the following section.)  File: qdoc.info, Node: Lists Streams and Tuples, Next: Built-In Operators, Prev: Applications, Up: Expressions 6.3 Lists, Streams and Tuples ============================= The Q language provides three closely related general constructs for representing sequences of objects: lists, streams and tuples. We start out with a discussion of the "list" data structure. The constant `[]' denotes the empty list. In general, a list consisting of elements `X1', `X2', ..., `Xn' is denoted `[X1,X2,...,Xn]'. For instance, `[a,b,c]' consists of three elements (symbols) `a', `b' and `c'. In contrast to statically typed languages like Haskell and ML, list members may have different types; e.g., `[a,10,"a string"]' is a perfectly well-formed list consisting of a symbol, a number and a string. It is also possible to have nested lists, as in `[a,[b,c]]' which consists of two elements, the symbol `a' and the list `[b,c]'. You can also have a trailing comma in a list, as in `[a,b,c,]'. The trailing comma is ignored, so the above is exactly the same as `[a,b,c]'. This makes it possible to format a list as follows, which lets you add new items at the end more easily: def ITEMS = [ "This is the first item", "This is the second item", ... "This is the last item", ]; As in Prolog, lists are actually represented in a right-recursive fashion using a binary constructor `[|]' which takes as its arguments the "head" element of the list and the list of the remaining elements, called the "tail" of the list. Thus, e.g., `[a,b,c]' is simply a shorthand notation for `[a|[b,c]]', which denotes a list with head `a' and tail `[b,c]' (which is in turn a shorthand for `[b|[c]]', etc.). You can also mix both styles of notation; for instance, `[a,b|[c,d]]' is another way to represent the 4-element list `[a,b,c,d]'. Note that `[a|[b,c]]' is different from `[a,[b,c]]': the former denotes a three-element list, while the latter is a two-element list whose second member happens to be a list itself. Also note that the `[|]' constructor can in fact be applied to any pair of values (the second value does not have to be a list); e.g., `[a|b]' is a perfectly well-formed expression (although the built-in length, indexing and concatenation operations described in *Note String List and Tuple Operators::, will fail on such values). Q also provides so-called "enumerations" to construct lists from a range of values. These usually take the form `[X1,X2,...,Xn..Y]' which is just syntactic sugar for the application `enum [X1,X2,...,Xn] Y' involving the built-in `enum' function, see *Note Enumeration Types::, for details. Syntactic sugar is also provided for "list comprehensions", which are discussed in *Note Conditionals and Comprehensions::. Analogous constructs are also provided for streams and tuples. "Streams" look exactly like lists except that they are written with curly braces. The difference between lists and streams is that the head and tail of a stream are not evaluated until their values are actually required during the course of a computation. For instance, you will find that if you enter an expression like `{1+1,2+2,3+3}' in the interpreter, then the value of that expression will be just the expression itself, because the embedded applications of the `+' operator are not evaluated: ==> {1+1,2+2,3+3} {1+1,2+2,3+3} In contrast, if you enter the same sequence as a list, its elements will be evaluated immediately: ==> [1+1,2+2,3+3] [2,4,6] A stream element will be evaluated automatically when it is accessed, e.g., using the index operator: ==> {1+1,2+2,3+3}!1 4 This is also known as "lazy" or "non-strict evaluation": the evaluation of the stream elements is deferred until they are actually needed. In fact, the entire tail of a stream may be deferred, which makes it possible to create infinite sequences whose members are produced "on the fly" while traversing the sequence. Streams are implemented in terms of so-called "special forms" which will be introduced in *Note Special Forms::. We therefore postpone a closer description of this data structure until *Note Special Constructors and Streams::. Also note that the Q language itself only defines the stream constructors; most other stream functions are actually implemented by the standard library script `stream.q', see *Note Streams::. "Tuples" also work in much the same fashion as lists: The constant `()' denotes the empty tuple, and a tuple consisting of elements `X1', `X2', ..., `Xn' is written as `(X1,X2,...,Xn)', which is equivalent to `(X1|(X2|...|(Xn|())))', where the notation `(X|Xs)' denotes a tuple consisting of a first element `X' and the tuple `Xs' of the remaining elements. A trailing comma may follow the last tuple element, so that `(a,b,c,)' is the same as `(a,b,c)'. The trailing comma is optional in all cases except one: As in the Python programming language, 1-tuples are indicated with a trailing comma, to distinguish them from ordinary parenthesized expressions. Hence `(a,)' denotes the 1-tuple with the single member `a', while `(a)' just denotes `a' itself. Note that in contrast to languages like ML and Haskell, tuples are not actually needed to represent sequences of values with different types, as Q's list data structure already allows you to do that. Also, Q's tuples are a lot more flexible than in ML and Haskell, as they can be handled mostly like list values. The big difference between lists and tuples in the Q language is that, while lists are always represented as recursive data objects using a binary constructor symbol (just the way that they are written), tuples are actually implemented as "vectors" which are stored as contiguous sequences in memory. (This only works for "well-formed" tuples. If the tail `Xs' of a tuple `(X|Xs)' is not a tuple, then this value can only be represented using an explicit application of the tuple constructor.) Therefore tuples normally use up much less memory than lists of the same size, and they also allow constant-time access to their members. The size of a tuple can be determined in constant time as well. In contrast, the same operations, when applied to a list, require time proportional to the size of the list. On the other hand, lists are more efficient when accessing the tail of a list and when a new element is prepended to a list, which can both be done in constant time. Here a tuple needs time proportional to its size, since the member sequence of the original tuple must be copied when accessing its tail or when prepending an element. These tradeoffs should be carefully considered when deciding whether to implement a given sequence as a list or a tuple. Tuples are usually the best choice for implementing fixed sequences requiring fast random access to its individual members, whereas lists provide the most efficient representation if the elements are traversed and manipulated sequentially. If necessary, you can easily convert between these two representations using the built-in functions `list' and `tuple', see *Note Conversion Functions::. Lists, streams and tuples are frequently combined to create nested structures like a tuple of lists and/or streams, a list (or stream) of lists, or a list (or stream, or tuple) of tuples. The latter construct (sequences of tuples) is so common that Q provides special support for it in the form of the following "grouping" syntax. The notation `[X1, X2, ...; Y1, Y2, ...; ...]' is a shorthand for `[(X1, X2, ...), (Y1, Y2, ...), ...]'. Note that there must be at least one element in each group (but the number of elements in different groups may vary), and there must be at least one group (a single group is denoted with a trailing `;'). Thus, for instance, `[A,B;C,D;X,Y,Z]' is the same as `[(A,B),(C,D),(X,Y,Z)]', and `[X,Y,Z;]' is the same as `[(X,Y,Z)]'. The same notation also applies to streams and tuples.  File: qdoc.info, Node: Built-In Operators, Next: User-Defined Operators, Prev: Lists Streams and Tuples, Up: Expressions 6.4 Built-In Operators ====================== Besides the forms of expressions already discussed above, the Q language also provides a collection of prefix, infix and mixfix operator symbols for common arithmetic, logical, relational and other operations. A complete list of these symbols is given below, in order of decreasing precedence: `' ` ~ &' quotation operators (unary) `.' function composition `^ !' exponentiation and subscript `- # not' prefix operators (unary) `* / div mod and and-then' multiplication operators `++ + - or or-else' addition operators `< > = <= >= <> ==' relational operators `$' infix application operator `if-then-else' conditional expressions `||' sequence operator `\X.Y' lambda abstractions Most of these symbols have their usual meaning; a closer description follows below. All binary operators are left-associative, with the exception of `^', `!' and `$' which associate to the right, and the relational operators which are nonassociative. Application takes precedence over all these operations except the quotation operators; hence `sqrt X^3' denotes `(sqrt X)^3' and _not_ `sqrt (X^3)'. The quotation operators have the highest possible precedence, and hence bind stronger than even applications. Parentheses are used to group expressions and overwrite default precedences and associativity as usual. C programmers will also note that the logical operators have the same "wrong" precedence as in Pascal. Thus you should make sure that you always parenthesize relational expressions when combining them with logical connectives. You should also note that unary minus _must_ be parenthesized when appearing in an argument of a function application. For instance, although `foo X -Y' is a perfectly well-formed expression, it denotes `(foo X) - Y' and _not_ `foo X (-Y)' as you might have intended by the spacing which is however ignored by the Q compiler. As already noted in *Note Lexical Matters::, unary minus in front of a number is special; it causes values such as `-2' to be interpreted as negative numbers rather than denoting an explicit application of the unary minus operator (an explicit application of unary minus can be denoted using the built-in `minus' symbol; see below). The rules for parenthesizing negative numbers are the same as those for unary minus; you have to write `foo X (-2)' instead of `foo X -2' (which denotes `(foo X) - 2'). In the Q language, expressions involving prefix and infix operators are merely syntactic sugar for applications. By enclosing such an operator in parentheses, you can turn it into an ordinary prefix function. For instance, `X+Y' is exactly the same as `(+) X Y', and `not X' actually denotes the application `(not) X'. The built-in operators simply provide a more convenient way for applying these operations to their arguments, which resembles common mathematical notation. Moreover, you can also turn a binary operator into a unary function by specifying either the left or the right operand. E.g., `(1/)' denotes the reciprocal and `(*2)' the doubling function. Such constructs are commonly called "operator sections". Inside a section, the usual precedence and associativity rules apply. For instance, the `X+3' subterm in `(*(X+3))' _must_ be parenthesized because `*' has a higher precedence than `+', and hence the partial expression `(*X+3)' is not well-formed. The `-' operator plays a somewhat awkward role in the syntax, since it is used to denote both unary and binary minus. Q adopts the convention that the notation `(-)' always denotes _binary_ minus; unary minus may be denoted by the built-in `minus' function. Thus the expression `minus X' applies unary minus to `X'. Note that this construct always denotes an explicit application of the unary minus operation. For instance, `minus 5' denotes the application of unary minus to the integer `5', while `-5' is a negative integer. Also note that the construct `(-X)' is _not_ an operator section, but a parenthesized expression involving unary minus. The easiest way to construct an operator section which subtracts a given value from its argument is to formulate the function using the addition operator as in `(+(-X))'. Another special case are the mixfix operators `if-then-else' (conditional expression) and `\X.Y' (lambda abstraction). These are just syntactic sugar for the standard library functions `ifelse'/`when' and the built-in function `lambda', so no special syntax is needed to turn them into prefix functions. * Menu: * Quotation Operators:: * Function Composition:: * Arithmetic Operators:: * Relational Operators:: * Logical and Bit Operators:: * String List and Tuple Operators:: * Application and Sequence Operators:: * Conditional Expressions and Lambdas::  File: qdoc.info, Node: Quotation Operators, Next: Function Composition, Prev: Built-In Operators, Up: Built-In Operators 6.4.1 Quotation Operators ------------------------- The `'' (quote), ``' (backquote), `~' (tilde) and `&' (ampersand) operators are used to deal with so-called "special forms". The quote operator quotes an expression as a literal; it is a constructor symbol and hence becomes part of the quoted expression. The backquote and tilde operators are used to "splice" and "force" subterms in an expression. The ampersand operator causes "special arguments" to be memoized. We postpone a discussion of these operations until *Note Special Forms::.  File: qdoc.info, Node: Function Composition, Next: Arithmetic Operators, Prev: Quotation Operators, Up: Built-In Operators 6.4.2 Function Composition -------------------------- The `.' operator denotes function composition: `(F.G) X = F (G X)'. For instance, to double a value and then add 1 to it, you can use an expression like `((+1).(*2)) X'. Note the parentheses around the composition - as application binds stronger than composition, `F.G X' is interpreted as `F.(G X)' rather than `(F.G) X'. If a composition involves applications with integer arguments, extra whitespace may be needed to distinguish between the `.' operator and the decimal point; e.g., `F 5 . G' denotes a composition of the functions `F 5' and `G', whereas `F 5.G' is an application of the function `F' to two arguments, the floating point value `5.' and the function `G'. Also note that the composition operator is associative, since `(F.G.H) X = ((F.G).H) X = F (G (H X)) = (F.(G.H)) X'.  File: qdoc.info, Node: Arithmetic Operators, Next: Relational Operators, Prev: Function Composition, Up: Built-In Operators 6.4.3 Arithmetic Operators -------------------------- The operators `+', `-', `*', `/' denote the sum, the difference, the product and the quotient of two numeric values, respectively. As already noted, `-' is also used as unary minus. The operators `div' and `mod' denote integer quotient and modulo, respectively; they only apply to integers. The `^' operator denotes exponentiation, as defined by `X^Y = exp (ln X*Y)'; it always returns a floating point value. (If `X<0' then `X^Y' is defined only if `Y' is an integer. `0^0' is left undefined as well, so if you would like to have that `0^0 = 1' then you must add corresponding equations yourself. Also note that the `complex.q' standard library module extends the built-in definition of the exponentiation operator to handle the case that `X<0' with general exponent; see *Note Complex Numbers::.) The argument and corresponding result types of these operations are summarized in the following table (`Int' denotes integer, `Float' floating point, and `Real' integer or floating point values): `+ - *' `Int' `Int' -> `Int' `Int' `Float' -> `Float' `Float' `Int' -> `Float' `Float' `Float' -> `Float' `/ ^' `Real' `Real' -> `Float' `div mod' `Int' `Int' -> `Int' `- (unary)' `Int' -> `Int' `Float' -> `Float' Note that, as of Q 7.2, all floating point operations properly deal with IEEE floating point infinities and NaN ("not a number") values. Consequently, division by zero now yields one of the special values `inf', `-inf' or `nan', depending on the value of the first operand. (Of course, this will only work as described on systems with a proper IEEE floating point implementation. The interpreter makes no attempt to emulate IEEE floating point values on systems which do not provide them natively.)  File: qdoc.info, Node: Relational Operators, Next: Logical and Bit Operators, Prev: Arithmetic Operators, Up: Built-In Operators 6.4.4 Relational Operators -------------------------- The operators `<', `>', `=', `<=', `>=', `<>' are binary predicates meaning "less", "greater", "equal", "less or equal", "greater or equal" and "not equal", respectively. The built-in definition of these operations only applies to numbers, strings and truth values. All relational operators return truth values (`true', `false') as results. Strings are compared lexicographically, on the basis of the character encoding (which, as of Q 7.0, usually is Unicode). Truth values are ordered by `false < true'. If you would like to compare other types of values than the basic objects mentioned above, normally you will have to provide suitable definitions yourself. For instance, you might wish to extend the equality operation to other built-in and user-defined data structures such as lists, trees, etc., by "overloading" the `=' operator accordingly. The following equations implement an equality predicate on lists (the parentheses on the left-hand side are necessary to prevent the equality operation to be interpreted as the equal sign separating left-hand and right-hand side of an equation): ([] = []) = true; ([] = [Y|Ys]) = false; ([X|Xs] = []) = false; ([X|Xs] = [Y|Ys]) = (X=Y) and then (Xs=Ys); Rules for other comparison operators (`<>', `<=', etc.) could be added in the same fashion. Actually, the standard library provides corresponding definitions; see *Note The Standard Library::. A special case arises with types consisting of only nullary constructor symbols declared with `const', so-called "enumeration types", see *Note Types::. The values of such a type can always be compared with the relational operators, using the order in which they are listed in the type declaration. (A special case of this is the order of the built-in truth values.) For instance, assume that the `Day' type is declared as follows: type Day = const sun, mon, tue, wed, thu, fri, sat; Then the listed constants will be ordered as `sun < mon < ... < sat'. Besides the equality predicate, the Q language also provides a notion of "syntactic" equality, implemented by the `==' operator, which applies to all kinds of expressions; see *Note Non-Linear Equations::, for details.  File: qdoc.info, Node: Logical and Bit Operators, Next: String List and Tuple Operators, Prev: Relational Operators, Up: Built-In Operators 6.4.5 Logical and Bit Operators ------------------------------- The logical operations `not', `and', `or' denote logical negation, conjunction and disjunction, respectively. These operators take truth values as their arguments. They are defined in a straightforward manner: not true = false; not false = true; true and true = true; true and false = false; false and true = false; false and false = false; true or true = true; true or false = true; false or true = true; false or false = false; Like most other programming languages, Q also has logical connectives for the "short-circuit evaluation" of logical expressions, which are denoted by the operators `and then' and `or else'. These operations are actually implemented as "special forms" which evaluate their arguments from left to right only as far as required to determine the value of the expression (cf. *Note Special Forms::). They are defined by the following built-in equations: true and then X = X; false and then X = false; false or else X = X; true or else X = true; The first operand is always evaluated. Depending on its value, the second operand may not be evaluated at all. For instance, if `X' evaluates to `false', then `X and then Y' immediately reduces to `false', without ever having to evaluate the second argument `Y'. On the other hand, if `X' evaluates to `true', then `Y' is evaluated and returned as the value of the expression. The `or else' operation works analogously. One reason for using short-circuit evaluation is efficiency: prevent unnecessary computations when an initial part of a logical expression already determines the value of the entire expression. Furthermore, short-circuit evaluation is sometimes essential to check a condition before evaluating an expression depending on the validity of this condition. For instance: (X <> 0) and then (foo (1/X) > 0) You should also note that, according to the above definitions, `X and then Y' and `X or else Y' are _always_ reduced if `X' is a truth value, even if the second operand `Y' does _not_ evaluate to a truth value. This may look a bit weird at first, but it is in fact the most reasonable way to implement short-circuit evaluation in the Q language, since Q uses dynamic typing, and hence the type of an expression is only known _after_ it has been evaluated. In fact, this "feature" can be quite useful at times. For instance, you can also use `and then' to write down a simple kind of conditional expression like check X and then bar X where `bar X' is the value you wish to compute, but only if the condition `check X' holds. The Q interpreter also uses the operators `not', `and' and `or' to denote bitwise logical operations on integers. Thus, `not X' denotes the one's complement of an integer `X', and `X and Y' and `X or Y' the bitwise logical conjunction and disjunction of integers `X' and `Y', respectively. These operations behave as if negative integers were represented in two's complement (although GMP actually uses a sign-magnitude representation). This means that for each integer `X', `-X = not X + 1', or, equivalently, `not X = -X-1'. For instance: ==> 17 and not 13 16 ==> 17 or not 13 -13 ==> not _ 12 These results become clear when considering that `17' has bits 0 (=1) and 4 (=16) on (and all other bits off), while `not 13' has bits 0 (=1), 2 (=4) and 3 (=8) off (and all other bits on). Thus, writing these numbers as unsigned bit sequences with most significant bits first, where `...1' denotes an infinite sequence of leading 1's, we have: 17 and not 13 = 10001 and not 1101 = 10001 and ...10010 = 10000 = 16 17 or not 13 = 10001 or not 1101 = 10001 or ...10010 = ...10011 (= not 1100 = not 12) = -13 Together with the built-in bit shift operations, the bitwise logical operations can also be used to implement "bitsets", i.e., sets of nonnegative integers represented by bit patterns. See *Note Arithmetic Functions::, for details. In case you are wondering about "exclusive or:" While this operator is not provided as a builtin, you can easily define it yourself as follows: public (xor) X Y @ 3; X xor Y = (X or Y) and not (X and Y);  File: qdoc.info, Node: String List and Tuple Operators, Next: Application and Sequence Operators, Prev: Logical and Bit Operators, Up: Built-In Operators 6.4.6 String, List and Tuple Operators -------------------------------------- The `++' operator denotes concatenation, `#' is the length or size operator, and the subscript operator `!' is used for indexing. These operations work consistently on strings, lists and tuples. For instance: "abc"++"xy" => "abcxy" [a,b,c]++[x,y] => [a,b,c,x,y] (a,b,c)++(x,y) => (a,b,c,x,y) #"abc" => 3 #[a,b,c] => 3 #(a,b,c) => 3 "abc"!1 => "b" [a,b,c]!1 => b (a,b,c)!1 => b Indexing with the subscript operator starts at zero, so that `X!0' and `X!(#X-1)' denote the first and last member of a string, list or tuple, respectively. Also note that since `!' is right-associative, double subscripts may have to be parenthesized. For instance, compare `(X!I)!J' and `X!I!J'=`X!(I!J)'. All list and tuple operators check that their first argument is a well-formed list or tuple value. However, the second argument of the `++' operator may in fact be any value. List concatenation just replaces the empty sublist marking the end of its first list argument with the second argument, as if it was defined by the following equations: []++Ys = Ys; [X|Xs]++Ys = [X|Xs++Ys]; Hence we have that, e.g., `[]++1 => 1' and `[1,2]++3 => [1,2|3]', which may look odd, but is consistent with the above equations. Tuple concatenation works in an analogous manner. All operators discussed in this section are also available for streams (cf. *Note Lists Streams and Tuples::). However, the corresponding definitions are not built into the language but are provided in the standard library script `stream.q', see *Note Streams::.  File: qdoc.info, Node: Application and Sequence Operators, Next: Conditional Expressions and Lambdas, Prev: String List and Tuple Operators, Up: Built-In Operators 6.4.7 Application and Sequence Operators ---------------------------------------- The application operator `$' and the sequence operator `||' have the lowest precedence among the prefix and infix operators. The `$' operator is right-associative and binds stronger than the `||' operator, which is left-associative. The infix application operator provides an alternative way to denote function application, i.e., `F $ X => F X'. This operator comes in handy when you want to pass around function application as a function parameter. Since this operator has very low precedence and is right-associative, it also provides a convenient way to write down "cascading function applications," such as `foo $ bar $ X+Y = foo (bar (X+Y))', which can save a lot of parentheses and make an expression more readable. The sequence operator lets you evaluate sequences of expressions in a given order. The value of the sequence is given by the rightmost expression. That is, X || Y => Y. The sole purpose of this construct is to allow operations with side-effects (such as I/O functions) to be carried out sequentially. A typical example is writes "Input: " || reads which prints a prompt on the terminal and then reads in a string. (The built-in functions `writes' and `reads' are described in *Note Built-In Functions::.)  File: qdoc.info, Node: Conditional Expressions and Lambdas, Prev: Application and Sequence Operators, Up: Built-In Operators 6.4.8 Conditional Expressions and Lambdas ----------------------------------------- As of version 7.1, Q also provides special syntax for two other kinds of constructs which are commonly encountered in functional programs: conditional expressions and lambda abstractions. A "conditional expression" takes the form `if X then Y else Z' where `X', `Y' and `Z' are arbitrary expressions. It returns the value of `Y' if the value of `X' is `true', and the value of `Z' if the value of `X' is `false'. The `else' part can also be omitted, in which case `()' is returned if the value of `X' is `false'. Like the built-in logical connectives `and then' and `or else', conditional expressions are special forms which are evaluated in short-circuit mode; thus, if `X' evaluates to `false', then `Y' will never be evaluated. "Lambda abstractions" can be denoted using the customary notation `\X.Y' where `X' is an expression denoting a pattern to be matched against the argument of the lambda function, and `Y' is another expression, the lambda body which is to be evaluated when the lambda function is applied to an argument. Note that if th