\input texinfo @c -*-texinfo-*- @comment %**start of header @setfilename qdoc.info @settitle The Q Programming Language @setchapternewpage odd @c SMALL BOOK version @c This edition has been formatted so that you can format and print it in @c the smallbook format. @c smallbook @include version.texi @syncodeindex fn cp @syncodeindex vr cp @syncodeindex ky cp @syncodeindex pg cp @syncodeindex tp cp @footnotestyle end @comment %**end of header @dircategory Programming Languages @direntry * Q: (qdoc). The Q programming language and system. @end direntry @c Experiment with smaller amounts of whitespace between chapters @c and sections. @tex \global\chapheadingskip = 15pt plus 4pt minus 2pt \global\secheadingskip = 12pt plus 3pt minus 2pt \global\subsecheadingskip = 9pt plus 2pt minus 2pt @end tex @c Experiment with smaller amounts of whitespace between paragraphs in @c the 8.5 by 11 inch format. @ifclear smallbook @tex \global\parskip 6pt plus 1pt @end tex @end ifclear @finalout @ifinfo The Q Programming Language, Version @value{VERSION}, @value{UPDATED}. Copyright (c) 1992-2008 by Albert Graef. This manual has been published in the series @cite{Musikinformatik und Medientechnik}, Bereich Musikinformatik, Musikwissenschaftliches Institut, Johannes Gutenberg-Universitaet Mainz @noindent 55099 Mainz@* Germany@* ISSN 0941-0309 The Q programming system is distributed under the terms of the GNU General Public License. See the software license accompanying the distribution for details. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. @ignore Permission is granted to process this file through TeX and print the results, provided the printed document carries copying permission notice identical to this one except for the removal of this paragraph (this paragraph not being relevant to the printed manual). @end ignore Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the author. @end ifinfo @titlepage @title The Q Programming Language @subtitle Version @value{VERSION} @subtitle @value{UPDATED} @author by Albert Gr@"af @author Johannes Gutenberg-University Mainz @page @vskip 0pt plus 1filll Copyright @copyright{} 1992-2008 by Albert Gr@"af This document describes version @value{VERSION} of the Q programming language and system. This manual has been published in the series @cite{Musikinformatik und Medientechnik}, Be@-reich Musikinformatik, Musikwissenschaftliches Institut, Johannes Gutenberg-Universit@"at Mainz @noindent 55099 Mainz@* Germany@* ISSN 0941-0309 The Q programming system is distributed under the terms of the GNU General Public License. See the software license accompanying the distribution for details. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are preserved on all copies. Permission is granted to copy and distribute modified versions of this manual under the conditions for verbatim copying, provided that the entire resulting derived work is distributed under the terms of a permission notice identical to this one. Permission is granted to copy and distribute translations of this manual into another language, under the above conditions for modified versions, except that this permission notice may be stated in a translation approved by the author. @end titlepage @page @node Top, Introduction, (DIR), (DIR) @ifinfo This document describes the Q programming language and system, version @value{VERSION}, @value{UPDATED}. Written by Albert Graef, University of Mainz. @end ifinfo @menu * Introduction:: * Getting Started:: * Lexical Matters:: * Scripts and Modules:: * Declarations:: * Expressions:: * Equations and Expression Evaluation:: * Types:: * Special Forms:: * Built-In Functions:: * The Standard Library:: * Clib:: Appendix: * Q Language Grammar:: * Using Q:: * C Language Interface:: * Debugging:: * Running Scripts in Emacs:: * References:: * Index:: @end menu @node Introduction, Getting Started, Top, Top @chapter Introduction Q stands for ``equational'', so Q, in a nutshell, is a programming language which lets you ``program by equations''. You specify a system of equations which the interpreter uses as ``rewrite rules'' to reduce expressions to ``normal form''. This allows you to formulate your programs in a high-level, concise and declarative style. Q's development started out in the 1990s, motivated by the author's research on term pattern matching @iftex [Gr@"af 1991] @end iftex @ifinfo [Graef 1991] @end ifinfo and inspired by the pioneering work on equational programming by Michael O'Donnell and others [O'Donnell 1985], as an attempt to show that term rewriting would provide a feasible basis for a practical programming language. I think that this goal has been achieved, and that the present interpreter is efficient and robust enough for practical purposes. Q has been ported to a variety of operating systems, such as BeOS, FreeBSD, Linux, MacOS X, Solaris and Windows. Porting to other modern UNIX/POSIX environments should be a piece of cake. Thus Q can be used on most modern computer systems, and Q scripts written on one system will usually run on all supported platforms. The Q language supports a rich variety of built-in types, like arbitrary precision integers, floating point numbers (double precision 64 bit), truth values, strings, tuples, lists, streams (a ``lazy'' list data structure with call-by-need evaluation), lambda functions and files. It also provides primitives for exception handling and multithreaded execution. Q scripts can be broken down into ``modules'', each with their separate namespace, and Q gives you full control over which symbols (function, variable and type names) are exported by each module. This makes it easy to organize large scripts in a modular fashion. Q also allows you to interface to ``external'' modules written in the C programming language, which provides a means to access functions in C and C++ libraries and employ C's higher processing speed for time-critical tasks. Conversely, Q scripts can also be executed from C and C++, which allows Q to be used as an embedded language or term rewriting engine in C/C++ applications. As a practical programming language, Q comes with ``batteries included''. The standard library, which is mostly written in Q itself, provides rational and complex numbers, a lot of useful tuple, list and stream processing functions (including comprehensions), some common container data structures (dictionaries, sets, etc.), and an interface to the PostScript language. It also includes an extensive system interface which offers services such as binary and C-style formatted I/O, BSD socket I/O, process management, POSIX threads, regular expression matching and internationalization features. Additional extension modules provide interfaces to a number of other third party libraries, which turns Q into a practical tool for a variety of application areas. In difference to other functional languages, Q is entirely based on the notions of @dfn{rewrite rules}, @dfn{reductions} and @dfn{irreducible expressions} (also known as @dfn{normal forms}) pertaining to the @dfn{term rewriting} calculus. A Q ``program'', called a @dfn{script}, consists of equations which are treated as rewrite rules and are used to reduce expressions to normal form. The normal form of an expression denotes its value, which can itself be a compound expression. Q has no rigid distinction between ``constructor'' and ``defined'' function symbols and it also allows you to evaluate expressions containing ``free'' variables. Basically, both sides of an equation may involve arbitrary expressions. Therefore Q can also be used as a tool for symbolic expression evaluation. On the surface, Q looks very much like contemporary functional languages such as Miranda or Haskell. In fact, the syntax of the language has largely been inspired by the first edition of @cite{Introduction to Functional Programming} by Richard Bird and Philip Wadler. However, Q is an interpreted language with dynamic typing and eager (leftmost-innermost) evaluation, which is more in line with classical functional languages such as Lisp. For the sake of efficiency, Q scripts are first translated into ``bytecode'' (an intermediate binary format) which is executed on a virtual stack machine. The interpreter automatically optimizes tail recursion, such that ``iterative'' algorithms can be realized in constant stack space. Besides the built-in I/O operations, the language is free of side-effects; in particular, Q itself does not have mutable variables (although the standard library provides such imperative features for those who need them). Q also provides two novel and (IMHO) interesting features: a notion of @dfn{special forms} which allows to handle both macro-like functions and lazy evaluation in a uniform setting without having to give up the basic eager evaluation strategy; and a notion of @dfn{type guards} which provides a means to cope with hierarchies of abstract data types (similar to the notion of classes with single inheritance in object-oriented languages) in the context of a term rewriting language. Using Q is supposed to be fairly simple: you throw together some equations, start the interpreter and then type in the expressions you wish to evaluate. All this can be done with a few keystrokes, if you use the Emacs Q mode supplied with the package. A graphical user interface for Windows is also available. Of course, you can also run the Q programming utilities from the command line if you prefer that. The manual is organized as follows. In @ref{Getting Started}, we start out with a brief and informal introduction to the main features of the language, and a glimpse of how Q scripts look like. @ref{Lexical Matters}, describes the lexical part of the Q language. In @ref{Scripts and Modules}, we discuss how declarations, definitions and equations are put together to form a script, and how to break down larger scripts into a collection of smaller modules which can be managed separately. @ref{Declarations}, discusses how certain entities like types, variables and function symbols can be declared in a Q script. @ref{Expressions}, treats the syntax of Q expressions, and describes the built-in operators provided by the Q language. @ref{Equations and Expression Evaluation}, is about equations and variable definitions, and how they are used in the evaluation process. @ref{Types}, and @ref{Special Forms}, describe the facilities provided by the Q language for dealing with abstract data types and deferred evaluation. @ref{Built-In Functions}, and @ref{The Standard Library}, discuss the built-in and standard library functions of the Q language. @ref{Clib}, describes Q's ``system module'' which provides access to some important functions from the C library. The appendix gives additional information about some aspects of the language and its implementation. @ref{Q Language Grammar}, contains a summary of the Q language syntax in BNF. @ref{Using Q}, provides a description of the programming tools included in the Q programming system, @ref{C Language Interface}, describes Q's interface to the C programming language, and @ref{Debugging}, is a brief introduction to the symbolic debugger built into the Q interpreter. Finally, @ref{Running Scripts in Emacs}, discusses how Q scripts can be edited and run using the Emacs editor. @noindent @b{Acknowledgements} This project has become much further reaching and took a lot longer than I anticipated when I started out working on it, and I wouldn't have been able to keep it going without the patience and support of my beloved wife Evelyn and children Sebastian, Janosch, Yannic and Miriam. Next I have to acknowledge the support and help provided by colleagues at Mainz and Bremen (Germany) and at INRIA Lorraine (France) during the initial phases of this project. In 1992 Dieter Hofbauer invited me to the term rewriting group at INRIA Lorraine to discuss the design of the (then still nascent) Q with his colleagues and friends. During the following years, I also collaborated with Frank Drewes at Bremen (now at Umea, Sweden) who was involved in the initiation of this project, and Klaus Barthelmann at Mainz who provided many comments which helped to improve the early versions of this manual. Frank and Klaus tested various early beta versions of the Q interpreter, and contributed sample scripts as well as substantial parts of the standard Q library. I'd also like to express my gratitude to the growing community of users of the Q language all over the world, in particular the members of the Q mailing list, for the lively discussions which helped to improve the language and its implementation, and for the contributed Q modules and applications. Special thanks are due to John Cowan for proofreading the manual, for his valuable suggestions concerning design and implementation issues and for his Q-Chicken interface, and to Rob Hubbard for his comprehensive rational number and polynomial modules which are now part of the distribution. Last but not least, thanks are also due to the developers of ML and Haskell. While these people have not been involved with Q in any way, their work, which has changed the landscape of functional programming, has been a constant source of inspiration for me. @node Getting Started, Lexical Matters, Introduction, Top @chapter Getting Started A new programming language is best conveyed through some examples, and therefore it is common practice to start a language description with an introductory chapter which treats the most essential language features in a fairly informal manner. This is also the purpose of the present chapter. We first show how some of the ``standard features'' of the Q programming language can be employed for using the Q interpreter effectively as a sophisticated kind of ``desktop calculator''. The remainder of this section then adresses the question of how you can extend the Q environment by providing your own definitions in Q scripts, and describes the evaluation process and the treatment of runtime errors in the interpreter. @menu * Using the Interpreter:: * Using the Standard Library:: * Writing a Script:: * Definitions:: * Runtime Errors:: @end menu @node Using the Interpreter, Using the Standard Library, Getting Started, Getting Started @section Using the Interpreter To begin with, let us show how the Q interpreter is invoked to evaluate some expressions with the built-in functions provided by the Q language. Having installed the Q programming system, you can simply invoke the interpreter from your shell by typing @code{q}. The interpreter will then start up, display its sign-on, and leaves you at its prompt: @smallexample ==> @end smallexample This indicates that the interpreter is waiting for you to type an expression. Let's start with a simple one: @smallexample ==> 23 23 @end smallexample Sure enough, that's correct. Whenever you type an expression, the interpreter just prints its value. Any integer is a legal value in the Q language, and the common arithmetic operations work on these values as expected. Q integers are ``bignums'', i.e., their size is only limited by available memory: @smallexample ==> 16753418726345 * 991726534256718265234 16614809890429729930396098173389730 @end smallexample ``Real'' numbers (more exactly, their approximation using 64 bit double precision floating point numbers) are provided as well. Let's try these: @smallexample ==> sqrt (16.3805*5)/.05 181.0 @end smallexample The @code{sqrt} identifier denotes a built-in function which computes the square root of its argument. (A summary of the built-in functions can be found in @ref{Built-In Functions}.) What happens if you mistype an expression? @smallexample ==> sqrt (16.3805*5)/,05 ! Syntax error >>> sqrt (16.3805*5)/,05 ^ @end smallexample As you can see, the interpreter not only lets you know that you typed something wrong, but also indicates the position of the error. It is quite easy to go back now and correct the error, using the interpreter's ``command history''. With the up and down arrow keys you can cycle through the expressions you have typed before, edit them, and resubmit a line by hitting the carriage return key. Also note that what you typed is stored in a ``history file'' when you exit the interpreter, which is reloaded next time the interpreter is invoked. A number of other useful keyboard commands are provided, see @ref{Command Line Editing, Readline, Command Line Editing, readline, The GNU Readline Library}. In particular, you can have the command line editor ``complete'' function symbols with the @code{} key. E.g., if you type in @code{sq} and press the @code{} key, you will get the @code{sqrt} function. Before we proceed, a few remarks about the syntax of function applications are in order. The first thing you should note is that in Q, like in most other modern functional languages, function application is simply denoted by juxtaposition: @smallexample ==> sqrt 2 1.4142135623731 @end smallexample Multiple arguments are specified in the same fashion: @smallexample ==> max 5 7 7 @end smallexample Parentheses can be used for grouping expressions as usual. In particular, if a function application appears as an argument in another function application then it must be parenthesized as follows: @smallexample ==> sqrt (sqrt 2) 1.18920711500272 @end smallexample Another important point is that operators are in fact just functions in disguise. You can turn any operator into a prefix function by enclosing it in parentheses. Thus @code{(+)} denotes the function which adds its two arguments, and @code{X+1} can also be written as @code{(+) X 1}; in fact, the former expression is nothing but ``syntactic sugar'' for the latter, see @ref{Built-In Operators}. You can easily verify this in the interpreter: @smallexample ==> (+) X 1 X+1 @end smallexample You can also have partial function applications like @code{(*) 2} which denotes a function which doubles its argument. Moreover, Q supports so-called @dfn{operator sections} which allow you to specify a binary operator with only either its left or right operand. For instance, @code{(1/)} denotes the reciprocal and @code{(+1)} the ``increment by 1'' function: @smallexample ==> (+1) 5 6 ==> (1/) 3 0.333333333333333 @end smallexample The interpreter maintains a global variable environment, in which you can store arbitrary expression values. This provides a convenient means to define abbreviations for frequently-used expressions and for storing intermediate results. Variable definitions are done using the @code{def} command. For instance: @smallexample ==> def X = 16.3805*5 ==> X 81.9025 @end smallexample As indicated, variable symbols usually start with a capital letter; an initial lowercase letter indicates a function symbol. This convention is valid throughout the Q language. However, it is possible to explicitly declare an uncapitalized symbol as a variable, using the @code{var} command, and then assign values to it, like so: @smallexample ==> var f ==> def f = sqrt @end smallexample You can also both declare a variable and initialize its value with a single @code{var} command: @smallexample ==> var f = sqrt ==> f X/.05 181.0 @end smallexample Multiple variable declarations and definitions can be entered on a single line, using commas to separate different definitions in a @code{def} or @code{var} command, and you can separate multiple expressions and commands on the same line with semicolons: @smallexample ==> def X = 16.3805*5, f = sqrt; X; f X/.05 81.9025 181.0 @end smallexample You can also bind several variables at once by using an expression @dfn{pattern} as the left-hand side of a variable definition: @smallexample ==> def (f, X) = (sqrt, 16.3805*5); f X/.05 181.0 @end smallexample (Such pattern-matching definitions only work with @code{def}; the left-hand side of a @code{var} declaration must always be a simple identifier.) Another useful feature is the built-in ``anonymous'' variable @samp{_}, which is always set to the value of the most recent expression value printed by the interpreter: @smallexample ==> _ 181.0 ==> 2*_ 362.0 @end smallexample Sometimes you would also like the interpreter to ``forget'' about a definition. This can be done by means of an @code{undef} statement: @smallexample ==> undef X; X X @end smallexample Besides @code{def}, @code{undef} and @code{var}, the interpreter provides a number of other special commands. The most important command for beginners certainly is the @code{help} command, which displays the online manual using the GNU info reader. You can also run this command with a keyword to be searched in the info file. For instance, to find out about all special commands provided by the interpreter, type the following: @smallexample ==> help commands @end smallexample @noindent (Type @code{q} when you are done reading the info file.) Other useful commands are @code{who} which prints a list of the user-defined variables, and @code{whos} which describes the attributes of a symbol: @smallexample ==> who f ==> whos f sqrt f user-defined variable symbol = sqrt sqrt builtin function symbol sqrt X1 @end smallexample You can save user-defined variables and reload them using the @code{save} and @code{load} commands: @smallexample ==> save saving .q_vars ==> def f = 0; f 0 ==> load loading .q_vars ==> f sqrt @end smallexample Some statistics about the most recent expression evaluation can be printed with the @code{stats} command: @smallexample ==> sum [1..123456] 7620753696 ==> stats 0.52 secs, 246916 reductions, 246918 cells @end smallexample The @code{stats} command displays the (cpu) time needed to complete an evaluation, the number of ``reductions'' (a.k.a. basic evaluation steps) performed by the interpreter, and the number of expression ``cells'' needed during the course of the computation. This information is often useful for profiling purposes. Other commands allow you to edit and run a script directly from the interpreter, and to inspect and set various internal parameters of the interpreter; see @ref{Command Language}. Of course, the Q interpreter can also carry out computations on non-numeric data. In fact, Q provides a fairly rich collection of built-in data types, and makes it easy to define your own. For instance, @dfn{strings} are denoted by character sequences enclosed in double quotes. Strings can be concatenated with the @samp{++} operator and the length of a string can be determined with @samp{#}. Moreover, the character at a given position can be retrieved with the (zero-based) indexing operator @samp{!}: @smallexample ==> "abc"++"xyz"; #"abc"; "abc"!1 "abcxyz" 3 "b" @end smallexample The same operations also apply to @dfn{lists}, which are written as sequences of values enclosed in brackets: @smallexample ==> [a,b,c]++[x,y,z]; #[a,b,c]; [a,b,c]!1 [a,b,c,x,y,z] 3 b @end smallexample As in Prolog, lists are actually represented as right-recursive constructs of the form @samp{[X|Xs]} where @code{X} denotes the head element of the list and @code{Xs} its tail, i.e., the list of the remaining elements. This representation allows a list to be traversed and manipulated in an efficient manner. @smallexample ==> def [X|Xs] = [a,b,c]; X; Xs a [b,c] ==> [0|Xs] [0,b,c] @end smallexample Another useful list operation is ``enumeration'', which works with all so-called ``enumeration types'' (cf. @ref{Enumeration Types}), as well as characters and numbers. Enumerations are constructed with the @code{enum} function, or using the familiar @code{[X..Y]} notation which is in fact only ``syntactic sugar'' for @code{enum}: @smallexample ==> enum "a" "k" ["a","b","c","d","e","f","g","h","i","j","k"] ==> ["a".."k"] ["a","b","c","d","e","f","g","h","i","j","k"] ==> [0..9] [0,1,2,3,4,5,6,7,8,9] ==> [0.1,0.2..1.0] [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1.0] @end smallexample @dfn{Tuples} are sequences of expressions enclosed in parentheses, which are Q's equivalent of ``vectors'' or ``arrays'' in other languages. Like lists, tuples can be concatenated, indexed and enumerated, but they use an internal representation optimized for efficient storage and constant-time indexing: @smallexample ==> (a,b,c)++(x,y,z); #(a,b,c); (a,b,c)!1 (a,b,c,x,y,z) 3 b ==> (0..9) (0,1,2,3,4,5,6,7,8,9) @end smallexample Q also offers built-in operations to convert between lists and tuples: @smallexample ==> tuple [a,b,c] (a,b,c) ==> list _ [a,b,c] @end smallexample Q provides yet another list-like data structure, namely ``lazy'' lists a.k.a. @dfn{streams}, which are written using curly braces instead of brackets. In difference to lists, streams only produce their elements ``on demand'', when they are needed in the course of a computation. This makes it possible to represent large and even infinite sequences of values in an efficient manner. You can see this somewhat unusual behaviour if you try streams interactively in the interpreter. For instance, let's try to concatenate two streams: @smallexample ==> @{a,b,c@}++@{x,y,z@} @{a|@{b,c@}++@{x,y,z@}@} @end smallexample You probably noticed that the tail of the resulting stream, @code{@{b,c@}++@{x,y,z@}}, has not been evaluated yet. But that is all right, because it @emph{will} be evaluated as soon as we need it: @smallexample ==> _!4 y @end smallexample This lazy way of doing things becomes especially important when we work with infinite streams. We will have more to say about this in the following section. It is quite instructive to keep on playing a bit like this, to get a feeling of what the Q interpreter can do. You might also wish to consult @ref{Built-In Functions}, which discusses the built-in functions, and @ref{Using Q}, which shows how the interpreter is invoked and what additional capabilities it offers. To exit the interpreter when you are finished, invoke the built-in @code{quit} function: @smallexample ==> quit @end smallexample This function does not return a value, but immediately returns you to the operating system's command shell. @node Using the Standard Library, Writing a Script, Using the Interpreter, Getting Started @section Using the Standard Library The standard library is a collection of useful functions and data structures which are @emph{not} provided as built-ins, but are implemented by scripts which are mostly written in Q itself. Most of these scripts are included in the ``prelude script'' @code{prelude.q}, which is always loaded by the interpreter, so that the standard library functions are always available. See @ref{The Standard Library}, for an overview of these operations. You can check which standard library scripts a.k.a. ``modules'' are actually loaded with the interpreter's @code{modules} command. With the standard prelude loaded, this command shows the following: @smallexample ==> modules assert clib* complex cond error math prelude rational sort stdlib stream string tuple typec @end smallexample As you can see, there are quite a few modules already loaded, and you can use the functions provided by these modules just like any of the built-in operations. The standard library provides a lot of additional functions operating on numbers, strings and lists. For instance, you can take the sum of a list of (integer or floating point) numbers simply as follows: @smallexample ==> sum [1..5] 15 @end smallexample In fact, the library provides a rather general operation, @code{foldl}, which iterates a binary function over a list, starting from a given initial value. Using the @code{foldl} function, the above example can also be written as follows: @smallexample ==> foldl (+) 0 [1..5] 15 @end smallexample (There also is a @code{foldr} function which works analogously, but combines list members from right to left rather than from left to right.) Generalizing the notion of simple enumerations of numbers like @code{[1..5]}, the standard library also provides the general-purpose list-generating function @code{while}. For instance, we can generate the list of all powers of 2 in the range @code{[1..1000]} as follows: @smallexample ==> while (<=1000) (2*) 1 [1,2,4,8,16,32,64,128,256,512] @end smallexample @noindent (Recall from the previous section that @code{(2*)} is the doubling function. And @code{(<=1000)} is a predicate which checks its argument to be less than or equal to @code{1000}.) If we want to generate a list of a given size, we can use @code{iter} instead. So, for instance, we might compute the value of a finite geometric series as follows: @smallexample ==> sum (iter 4 (/3) 1) 1.48148148148148 @end smallexample The @code{map} function allows you to apply a function to every member of a list. For instance, the following expression doubles each value in the list @code{[1..5]}: @smallexample ==> map (2*) [1..5] [2,4,6,8,10] @end smallexample Lists can also be filtered with a given predicate: @smallexample ==> filter (>=3) [1..5] [3,4,5] @end smallexample The @code{scanl} operation allows you to compute all the sums of initial segments of a list (or accumulate any other binary operation over a list): @smallexample ==> scanl (+) 0 [1..5] [0,1,3,6,10,15] @end smallexample (Like @code{foldl}, @code{scanl} also has a sibling called @code{scanr} which collects results from right to left rather than left to right, starting at the end of the list.) You can partition a list into an initial part with a given length and the remaining part of the list with @code{take} and @code{drop}: @smallexample ==> take 3 [1..5]; drop 3 [1..5] [1,2,3] [4,5] @end smallexample The @code{takewhile} and @code{dropwhile} functions have a similar effect, but partition their input list according to a given predicate: @smallexample ==> takewhile (<=3) [1..5]; dropwhile (<=3) [1..5] [1,2,3] [4,5] @end smallexample Another useful list operation is @code{zip} which collects pairs of corresponding elements in two input lists: @smallexample ==> zip [1..5] ["a".."e"] [(1,"a"),(2,"b"),(3,"c"),(4,"d"),(5,"e")] @end smallexample The effect of @code{zip} can be undone with @code{unzip} which returns a pair of lists: @smallexample ==> unzip _ ([1,2,3,4,5],["a","b","c","d","e"]) @end smallexample The @code{zipwith} function is a generalized version of @code{zip} which combines corresponding members from two lists using a given binary function: @smallexample ==> zipwith (*) [1..5] [1..5] [1,4,9,16,25] @end smallexample All the operations described above -- and many others -- are provided by the @code{stdlib.q} script. It is instructive to take a look at this script and see how the operations are defined. In addition, the standard library provides yet another list generating function @code{listof} which implements so-called ``list comprehensions''. These allow you to describe list values in much the same way as sets are commonly specified in mathematics. For instance, you can generate a list of pairs @code{(I,J)} with @code{1<=J listof (I,J) (I in [1..5], J in [1..I-1]) [(2,1),(3,1),(3,2),(4,1),(4,2),(4,3),(5,1),(5,2),(5,3),(5,4)] @end smallexample The language provides syntactic sugar for list comprehensions so that the above example can also be written as follows: @smallexample ==> [(I,J) : I in [1..5], J in [1..I-1]] [(2,1),(3,1),(3,2),(4,1),(4,2),(4,3),(5,1),(5,2),(5,3),(5,4)] @end smallexample The same kind of construct also works with tuples: @smallexample ==> ((I,J) : I in [1..5], J in [1..I-1]) ((2,1),(3,1),(3,2),(4,1),(4,2),(4,3),(5,1),(5,2),(5,3),(5,4)) @end smallexample We also have a random number generator, which is implemented by the built-in @code{random} function. Here is how we can generate a list of 5 pseudo random 32 bit integers: @smallexample ==> [random : I in [1..5]] [1960913167,1769592841,3443410988,2545648850,536988551] @end smallexample To get random floating point values in the range [0,1] instead, we simply divide the results of @code{random} by @code{0xffffffff}: @smallexample ==> map (/0xffffffff) _ [0.456560674928259,0.41201544027124,0.801731596887515,0.592705060400233, 0.125027389993199] @end smallexample Lists can be sorted using quicksort (this one comes from @code{sort.q}): @smallexample ==> qsort (<) _ [0.125027389993199,0.41201544027124,0.456560674928259,0.592705060400233, 0.801731596887515] @end smallexample As already mentioned, the Q language also provides a ``lazy'' list data structure called ``streams''. Streams are like lists but can actually be infinite because their elements are only produced ``on demand''. The standard library includes a module @code{stream.q} which implements a lot of useful stream operations. Most list operations carry over to streams accordingly. For instance, if we want to create a geometric series like the one generated with @code{iter} above, but we do not know how many elements will be needed in advance, we can employ the stream generation function @code{iterate}: @smallexample ==> iterate (/3) 1 @{1|iterate (/3) ((/3) 1)@} @end smallexample The @code{@{|@}} stream constructor works in the same way as the @code{[|]} list constructor, but is ``special'' in that it does @emph{not} evaluate its arguments, i.e., the head and the tail of the stream. (Otherwise the call to @code{iterate} would never terminate.) To get all members of the series we can apply the @code{scanl} function: @smallexample ==> def S = scanl (+) 0 _; S @{0|scanl (+) (0+1) (iterate (/3) ((/3) 1))@} @end smallexample Now we can extract any number of initial values of the series using the @code{take} operation, and convert the resulting stream to an ordinary list with the @code{list} function. For instance, if we are interested in the first five values of the series, we proceed as follows: @smallexample ==> list (take 5 S) [0,1,1.33333333333333,1.44444444444444,1.48148148148148] @end smallexample Note that the stream @code{S} is really infinite; if we want, we can extract @emph{any} value in the series: @smallexample ==> S!9999 1.5 @end smallexample Let's see how many iterations are actually required to reach the limit @code{1.5} with an error of at most @code{1e-15}: @smallexample ==> #takewhile (>1e-15) (map (1.5-) S) 32 @end smallexample This means that the sum of the first 31 series terms is needed to get an accuracy of 15 digits (which is the best that we can hope for with 64 bit floating point values). We can readily verify this using @code{iter}: @smallexample ==> sum (iter 31 (/3) 1) 1.5 @end smallexample The standard library also implements stream enumerations and comprehensions which work like the corresponding list and tuple constructs but are evaluated in a lazy fashion, and the interpreter provides syntactic sugar for these. In particular, the notation @code{@{X..@}} or @code{@{X,Y..@}} can be used to specify an infinite arithmetic sequence. For instance, here is another way to define the infinite geometric series from above: @smallexample ==> def S = scanl (+) 0 @{1/3^N : N in @{0..@}@} ==> list (take 5 S) [0,1.0,1.33333333333333,1.44444444444444,1.48148148148148] @end smallexample It is worth noting here that streams are just one simple example of a lazy data structure, which happen to play an important role in many useful algorithms so that it makes sense to provide them as a builtin. The Q language makes it rather easy to define your own lazy data structures using so-called @dfn{special forms}. In fact, special forms provide a uniform means to define both lazy data constructors and macro-like functions which take their arguments unevaluated, employing ``call by name'' parameter passing. This will be discussed in detail in @ref{Special Forms}. Special forms are an integrated part of Q which adds a great amount of expressive power to the language. In particular, they allow specialized constructs such as a ``short-circuit'' conditional expressions to be defined in the standard library, just like ``ordinary'' functions, rather than having to provide them as builtins. One example is the conditional expression @code{if X then Y else Z} which is nothing but syntactic sugar for the standard library function @code{ifelse}: @smallexample ==> ifelse (5>0) "positive" "negative" "positive" ==> if 5>0 then "positive" else "negative" "positive" @end smallexample Another special form which is useful in many situations is @code{lambda} (as of Q 7.1, this one is actually provided as a builtin), which allows you to create little ``anonymous'' functions on the fly. The customary notation @code{\X.Y} is provided as a shorthand for @code{lambda X Y}. These constructs are also called @dfn{lambda abstractions}, or simply @dfn{lambdas}. For instance, the following interpreter command defines the well-known factorial function using a lambda abstraction and assigns the resulting function object to the variable @code{fac}: @smallexample ==> var fac = \N.if N>0 then N*fac (N-1) else 1 ==> fac \X1 . if X1>0 then X1*fac (X1-1) else 1 ==> map fac [1..10] [1,2,6,24,120,720,5040,40320,362880,3628800] @end smallexample Lambdas taking multiple arguments can be created like this: @smallexample ==> \X Y.(1-X)*Y \X1 . \X2 . (1-X1)*X2 ==> _ 0.9 0.5 0.05 @end smallexample The lambda arguments can also be patterns to be matched against the actual parameters. For instance: @smallexample ==> \(X,Y).(1-X)*Y \(X1,X2) . (1-X1)*X2 ==> _ (0.9,0.5) 0.05 @end smallexample Besides string and list processing functions and general utility functions of the kind sketched out above, the standard library also includes a collection of efficient ``container'' data structures which are useful in many applications. For instance, so-called @dfn{hashed dictionaries} (also known as ``hashes'' or ``associative arrays'') are implemented by the @code{hdict.q} script. The following expressions show how to create a dictionary object from a list, and how to access this dictionary using the index (@code{!}), @code{keys} and @code{vals} operations. (Note that here we have to load the @code{hdict.q} module explicitly, as it is not automatically included via the prelude. We could also import the @code{stdtypes.q} module instead, to get hold of all the container types.) @smallexample ==> import hdict ==> def D = hdict [(foo,99),(bar,-1),(gnu,foo)]; D!gnu foo ==> keys D [foo,bar,gnu] ==> vals D [99,-1,foo] @end smallexample This completes our little tour through the standard library. To find out more, please refer to @ref{The Standard Library}, or take a look at the scripts themselves. @node Writing a Script, Definitions, Using the Standard Library, Getting Started @section Writing a Script Now that we have taken the first big hurdle and are confident that we can actually get the Q interpreter to evaluate us some expressions, let us try to define a new function for use in the interpreter. That is, we are going to write our first own Q ``program'' or ``script''. In the Q language, there are actually two different ways to create new functions. As we have already seen in the previous sections, little ``anonymous'' functions are often computed on the fly by deriving them from existing functions using partial applications or lambda abstractions. The other way, which is often more convenient when the functions to be defined get more complicated, is the definition of named functions by the means of ``equations'', which works in a way similar to algebraic definitions in mathematics. We will discuss how to do this in the following. Having exited the Q interpreter, invoke your favourite text editor and enter the following simple script which implements the factorial function: @findex fac @cindex factorial function @smallexample fac N = N*fac(N-1) if N>0; = 1 otherwise; @end smallexample (Note that in order to define functions equationally, you really have to write a script file. For technical reasons there currently is no way to enter an equational definition directly in the interpreter.) Save the script in the file @code{fac.q} and then restart the interpreter as follows (here and in the following @samp{$} denotes the command shell prompt): @smallexample $ q fac.q @end smallexample You can also edit the script from within the interpreter (using @code{vi} or the editor named by the @code{EDITOR} environment variable) and then restart the interpreter with the new script, as follows: @smallexample ==> edit fac.q ==> run fac.q @end smallexample In any case, now the interpreter should know about the definition of @code{fac} and we can use it like any of the built-in operations: @smallexample ==> map fac [1..10] [1,2,6,24,120,720,5040,40320,362880,3628800] @end smallexample For instance, what is the number of 30-element subsets of a 100-element set? @smallexample ==> fac 100 div (fac 30*fac 70) 29372339821610944823963760 @end smallexample @cindex Newton's algorithm As another example, let us write down Newton's algorithm for computing roots of a function. Type in the following script and save it to the file @code{newton.q}: @smallexample newton DX DY F = until (satis DY F) (improve DX F); satis DY F X = abs (F X) < DY; improve DX F X = X - F X / derive DX F X; derive DX F X = (F (X+DX) - F X) / DX; @end smallexample Restart the interpreter with the @code{newton.q} script and try the following. Note that the target function here is @code{\Y.Y^3-X} which becomes zero when @code{Y} equals the cube root of @code{X}. @smallexample ==> var eps = .0001, cubrt = \X.newton eps eps (\Y.Y^3-X) X ==> cubrt 8 2.00000000344216 @end smallexample Well, this is not @emph{that} precise, but we can do better: @smallexample ==> def eps = 1e-12 ==> cubrt 8 2.0 @end smallexample @node Definitions, Runtime Errors, Writing a Script, Getting Started @section Definitions As mentioned in the previous section, in the Q language function definitions take the form of equations which specify how a given expression pattern, the left-hand side of the equation, is transformed into a new expression, the right-hand side of the equation. In this section we elaborate on this some more to give you an idea about how these equational definitions work. Our explanations will necessarily be a bit sketchy at this point, but the concepts we briefly touch on in this section will be explained in much more detail later. The left-hand side of an equation may introduce ``variables'' (denoted by capitalized identifiers, as in Prolog) which play the role of formal parameters in conventional programming languages. For instance, the following equation defines a function @code{sqr} which squares its (numeric) argument: @smallexample sqr X = X*X; @end smallexample An equation may also include a condition part, as in the definition of the factorial function from the previous section: @smallexample fac N = N*fac(N-1) if N>0; = 1 otherwise; @end smallexample As indicated above, several equations for the same left-hand side can be factored to the left, omitting repetition of the left-hand side. Furthermore, the @code{otherwise} keyword may be used to denote the ``default'' alternative. The left-hand side of an equation may actually be an arbitrary compound expression pattern to be matched in the expression to be evaluated. For instance, as we have already seen, the expressions @code{[]} and @code{[X|Xs]} denote, respectively, the empty list and a list starting with head element @code{X} and continuing with a list @code{Xs} of remaining elements, just as in Prolog. We can use these patterns to define a function @code{add} which adds up the members of a list as follows: @smallexample add [] = 0; add [X|Xs] = X+add Xs; @end smallexample Thus no explicit operations for ``extracting'' the members from a compound data object such as a list are required; the components are simply retrieved by ``pattern matching''. This works in variable definitions as well. For instance, in the interpreter you can bind variables to the head element and the tail of a list using a single definition as follows: @smallexample ==> def [X|Xs] = [1,2,3]; X; Xs 1 [2,3] @end smallexample Pattern matching also works if the arguments take the form of function applications. For instance, the following definition implements an insertion operation on binary trees: @smallexample insert nil Y = bin Y nil nil; insert (bin X T1 T2) Y = bin Y T1 T2 if X=Y; = bin X (insert T1 Y) T2 if X>Y; = bin X T1 (insert T2 Y) if X hd [] hd [] ==> sin (X+1) sin (X+1) @end smallexample You may be disconcerted by the fact that expressions like @code{hd []} and @code{sin (X+1)} actually denote legal values in the Q language. However, this is one of Q's key features as a term rewriting programming language, and it adds a great amount of flexibility to the language. Most importantly, it allows expressions, even expressions involving variables, to be evaluated in a symbolic fashion. For instance, you might add rules for algebraic identities and symbolic differentiation to your script and have the interpreter simplify expressions according to these rules. On the other hand, the Q language also allows you to refine the definition of any built-in or user-defined operation and thus it is a simple matter to add an ``error rule'' like the following to your script when needed: @smallexample hd [] = error "hd: empty list"; @end smallexample (The @code{error} function is defined in the standard library script @code{error.q}, see @ref{Diagnostics and Error Messages}.) In order to implement more elaborate error handling, you can also use Q's built-in @code{throw} and @code{catch} functions to raise and respond to exceptions on certain error conditions. This is discussed in @ref{Exception Handling}. @node Runtime Errors, , Definitions, Getting Started @section Runtime Errors There are a few error conditions which may cause the Q interpreter to abort the evaluation with a runtime error message. A runtime error arises when the interpreter runs out of memory or stack space, when the user aborts an evaluation with the interrupt key (usually @code{Ctl-C}), and when the condition part of an equation does not evaluate to a truth value (@code{true} or @code{false}). (All these error conditions can also be handled by the executing script, see @ref{Exception Handling}.) You can use the @code{break on} command to tell the interpreter that it should fire up the debugger when one of the latter two conditions arises. After @code{Ctl-C}, you can then resume evaluation in the debugger, see @ref{Debugging}. This is not possible if the interpreter stumbles across an invalid condition part; in this case the debugger is invoked merely to report the equation which caused the error, and to inspect the call chain of the offending rule. Hitting the carriage return key will return you to the interpreter's evaluation loop. For instance, reload the @code{fac.q} script from @ref{Writing a Script}: @smallexample fac N = N*fac(N-1) if N>0; = 1 otherwise; @end smallexample @noindent Now enable the debugger with the @code{break} command and type some ``garbage'' like @code{fac fac}: @smallexample ==> break on; fac fac ! Error in conditional 0> fac.q, line 1: fac fac ==> fac*fac (fac-1) if fac>0 (type ? for help) : @end smallexample What happened? Well, the debugger informs us that the error occurred when the interpreter tried to apply the first equation, @smallexample fac N = N*fac(N-1) if N>0; @end smallexample @noindent to the expression @code{fac fac}. In fact this is no big surprise since we cannot expect the interpreter to know how to compare a symbol, @code{fac}, with a number, @code{0}, which it tried when evaluating the condition part of the above equation. So let us return to the interpreter's prompt by hitting the carriage return key: @smallexample : ==> @end smallexample The interpreter now waits for us to type the next expression. (For a summary of debugger commands please refer to @ref{Debugging}. You can also type @code{?} or @code{help} while in the debugger to obtain a list of available commands.) @node Lexical Matters, Scripts and Modules, Getting Started, Top @chapter Lexical Matters The vocabulary of the Q programming language consists of notations for identifiers, operators, integers, floating point numbers, character strings, comments, a few reserved words which may not be used as identifiers, and some special punctuation symbols which are used as delimiters. @menu * Whitespace and Comments:: * Identifiers and Reserved Words:: * Operator Symbols:: * Numbers:: * Strings:: * Unicode Support:: @end menu @node Whitespace and Comments, Identifiers and Reserved Words, Lexical Matters, Lexical Matters @section Whitespace and Comments Whitespace (blanks, tabs, newlines, form feeds) serves as a delimiter between adjacent symbols, but is otherwise ignored. @cindex comment Comments are treated like whitespace: @smallexample /* This is a comment ... */ @end smallexample Note that these comments cannot be nested. C++-style line-oriented comments are also supported: @smallexample // C++-style comment ... @end smallexample Furthermore, lines beginning with the @code{#!} symbol denote a special type of comment which may be processed by the operating system's command shell and the Q programming tools. On UNIX systems, this (odd) feature allows you to execute Q scripts directly from the shell (by specifying the @code{q} program as a command language processor) and to include compiler and interpreter command line options in a script (see @ref{Running Scripts from the Shell}). @node Identifiers and Reserved Words, Operator Symbols, Whitespace and Comments, Lexical Matters @section Identifiers and Reserved Words @cindex identifier @cindex function symbol @cindex variable symbol @cindex type symbol Identifiers are denoted by the usual sequences of letters (including @samp{_}) and digits, beginning with a letter. Upper- and lowercase is distinct. In the Q language, identifiers are used to denote @dfn{type}, @dfn{function} and @dfn{variable} symbols. Type identifiers may start with either an uppercase or lowercase letter (the convention, however, is to use an initial uppercase letter). The case of the first letter matters, however, for function and variable symbols. As in Prolog, a capitalized identifier (such as @code{X}, @code{Xmax} and @code{XMAX}) indicates a variable symbol, whereas identifiers starting with a lowercase letter denote function symbols (unless they are declared as ``free'' variables, see below). Unlike in Prolog, the underscore @samp{_} counts as a @emph{lowercase} letter, hence @code{_MAX} is a function symbol, not a variable. (The idea behind this is that it allows you to get a function symbol which appears to start with an uppercase letter by stropping it with an initial @samp{_}.) However, as an exception to the general rule, the identifier @samp{_} @emph{does} denote a variable symbol, the so-called @dfn{anonymous} variable. The same rules also apply to symbols created interactively in the interpreter. Variables actually come in two flavours: @dfn{bound} and @dfn{free} variables, i.e., variables which also occur on the left-hand side of an equation, and variables which only occur on the right-hand side and/or in the condition part of an equation. Identifiers may also be @emph{declared} as free variables; see @ref{Declarations}. In this case, they may also start with a lowercase letter. @cindex identifier, qualified Type, function and free variable identifiers may also be @dfn{qualified} with a @dfn{module} identifier prefix (cf. @ref{Scripts and Modules}), to specifically denote a symbol of the given module. Such a qualified identifier takes the form @code{@var{module-id}::@var{identifier}}; no whitespace or comments are allowed between the module name, @samp{::} symbol and the function or variable identifier. Formally, the syntax of identifiers is described by the following grammatical rules: @smallexample identifier : unqualified-identifier | qualified-identifier qualified-identifier : module-identifier '::' unqualified-identifier unqualified-identifier : variable-identifier | function-identifier | type-identifier module-identifier : letter @{letter|digitsym@} type-identifier : letter @{letter|digitsym@} variable-identifier : uppercase-letter @{letter|digitsym@} | '_' function-identifier : lowercase-letter @{letter|digitsym@} letter : uppercase-letter|lowercase-letter uppercase-letter : 'A'|...|'Z'|@var{uppercase unicode letter} lowercase-letter : 'a'|...|'z'|'_'|@var{lowercase unicode letter} digitsym : '0'|...|'9'|@var{unicode digit} @end smallexample (Please refer to @ref{Q Language Grammar}, for a description of the BNF grammar notation used throughout this document.) The following symbols are reserved words of the Q language and may not be used as identifiers: @cindex reserved words @smallexample as const def else extern from if import include otherwise private public special then type undef var virtual where @end smallexample In addition, the following identifiers are predeclared as operator symbols (see @ref{Operator Symbols}) and cannot be used as normal identifiers either: @smallexample and div mod not or @end smallexample @node Operator Symbols, Numbers, Identifiers and Reserved Words, Lexical Matters @section Operator Symbols @cindex operator symbols Operator symbols may either take the form of function identifiers or they may be sequences of punctuation characters (excluding @samp{_} which serves as a letter in the Q language). Like function and free variable symbols, they may be qualified with a module identifier prefix: @smallexample op : unary-op|binary-op unary-op : opsym binary-op : opsym|'and' 'then'|'or' 'else' opsym : unqualified-opsym | qualified-opsym qualified-opsym : module-identifier '::' unqualified-opsym unqualified-opsym : function-identifier | punctsym @{punctsym@} punctsym : @var{unicode punctuation character} @end smallexample In either case, operator symbols @emph{must} be declared explicitly before they can be used, cf. @ref{Declarations}. The declaration determines the precedence and ``fixity'' of the operator (i.e., whether it acts as a unary prefix or a binary infix operator; see @ref{User-Defined Operators}, for more information on this). @cindex operators, built-in As already mentioned, the following identifiers are predefined as built-in operators: @smallexample and div mod not or @end smallexample As indicated by the syntax rules, the logical connectives @samp{and} and @samp{or} can also be combined with the keywords @samp{then} and @samp{else} to form the ``short-circuit'' connectives @samp{and then} and @samp{or else}. The punctuation symbols predefined as built-in operators are the following: @smallexample ` ' ~ & . || < > = <= >= <> == ++ + - * / ^ ! # $ @end smallexample Most of these can actually be redeclared by the programmer for his own purposes. However, there are a few symbols, called ``soft delimiters'', which play a special role in the syntax of the Q language (some of these are also used as operator symbols): @smallexample ~ . .. : | = == - \ @@ @end smallexample The soft delimiters may occur @emph{inside} user-defined operators, but if they are used as separate lexemes then they are treated like reserved words and thus they may not be declared as operator symbols (except for the purpose of making aliases, cf. @ref{Declarations}). The same applies to the reserved keywords (see @ref{Identifiers and Reserved Words}). @cindex delimiters, special (lexical syntax) Moreover, the following special symbols serve as ``hard delimiters'' in the Q language which always separate lexemes and thus may not occur inside operator symbols at all: @smallexample " , ; :: ( ) [ ] @{ @} @end smallexample The same applies to whitespace and other non-printable characters, as well as the comment delimiters (@samp{//} and @samp{/*} as well as initial @samp{#!} on a script line). Symbols consisting of punctuation are generally parsed using the ``longest possible lexeme'' a.k.a. ``maximal munch'' rule. Here, the ``longest possible lexeme'' refers to the longest prefix of the input such that the sequence of punctuation characters either forms a @emph{valid} (i.e., declared) operator symbol, or one of the reserved and special delimiter symbols listed above. Thus, e.g., @samp{..#} will usually be parsed as @samp{.. #}, i.e., a reserved @samp{..} symbol followed by a @samp{#} operator. This holds unless the entire sequence @samp{..#} has already been declared as an operator in the scope where it is used. The only exception to the ``maximal munch'' rule occurs inside declarations where a new operator symbol is being introduced. In this case the symbol extends up to the next hard delimiter symbol (usually @samp{)} or @samp{;}) or whitespace/non-printable character in the input. For instance, the following Q code snippet declares a new binary operator symbol @samp{+~%} (please refer to @ref{Declarations}, for an explanation of the declaration syntax): @smallexample public (+~%) X Y; @end smallexample Another special case arises with conglomerates of three or more consecutive @samp{:} symbols. In this case the initial @samp{::} is always treated as the designation of a qualified (operator) symbol. (In the unlikely case that you ever need to specify a ``guarded variable'' of the form @code{X : ::Type} with a qualified type in the built-in namespace, the space between the @samp{:} token and the following @samp{::} qualification designator is mandatory.) @node Numbers, Strings, Operator Symbols, Lexical Matters @section Numbers @cindex numeric constants @cindex digit sequence @cindex scaling factor Signed decimal numeric constants are denoted by sequences of decimal digits (only the standard digits 0..9 in the ASCII character set may be used here) and may contain a decimal point and/or a scaling factor. Integers can also be denoted in octal or hexadecimal, using the same syntax as in C: @smallexample number : ['-'] unsigned-number unsigned-number : '0' octdigitseq | '0x' hexdigitseq | '0X' hexdigitseq | digitseq ['.' [digitseq]] [scalefact] | [digitseq] '.' digitseq [scalefact] digitseq : digit @{digit@} octdigitseq : octdigit @{octdigit@} hexdigitseq : hexdigit @{hexdigit@} scalefact : 'E' ['-'] digitseq | 'e' ['-'] digitseq digit : '0'|...|'9' octdigit : '0'|...|'7' hexdigit : '0'|...|'9'|'a'|...|'f'|'A'|...|'F' @end smallexample Simple digit sequences without decimal point and scaling factor are treated as integers; if the sequence starts with @samp{0} or @samp{0x}/@samp{0X} then it denotes an integer in octal or hexadecimal base, respectively. Other numbers denote (decimal) floating point values. If a decimal point is present, it must be preceded or followed by at least one digit. Both the scaling factor and the number itself may be prefixed with a minus sign. (Syntactically, the minus sign in front of a number is interpreted as unary minus, cf. @ref{Expressions}. However, if unary minus occurs in front of a number, it is interpreted as a part of the number and is taken to denote a negative value. See the remarks concerning unary minus in @ref{Expressions}.) Some examples: @smallexample 0 -187326 0.0 -.05 3.1415e3 -1E-10 0177 0xaf -0XFFFF @end smallexample @node Strings, Unicode Support, Numbers, Lexical Matters @section Strings @cindex string constants @cindex character sequences @cindex double quotes String constants are written as character sequences enclosed in double quotes: @smallexample string : '"' @{char@} '"' char : @var{any character but newline and @code{"}} @end smallexample @cindex escape sequences To include newlines, double quotes and other (non-printable) characters in a string, the following escape sequences may be used: @kindex \n @kindex \r @kindex \t @kindex \b @kindex \f @kindex \" @kindex \\ @cindex newline character @cindex tab character @smallexample \n @r{newline} \r @r{carriage return} \t @r{tab} \b @r{backspace} \f @r{form feed} \" @r{double quote} \\ @r{backslash} @end smallexample Furthermore, a character may also be denoted in the form @code{\N}, where @code{N} is the character number in decimal, hexadecimal or octal (using the same syntax as for unsigned integer values). Note that the character number may consist of an arbitrary number of digits; the resulting value will be taken modulo the size of the local character set, which is 256 in the case of ASCII-only systems and 0x110000 if the interpreter has been built with Unicode support (see @ref{Unicode Support}). Optionally, the character number may also be enclosed in parentheses; this makes it possible to specify a string in which a character escape is immediately followed by another character which happens to be a valid digit, as in @code{"\(123)4"}. As of version 7.11 and later, the interpreter also supports symbolic character escapes of the form @samp{\&@var{name};}, where @var{name} is any of the XML single character entity names specified in the ``XML Entity definitions for Characters'', see @url{http://www.w3.org/TR/xml-entity-names/}. Note that, at the time of this writing, this is still a W3C working draft, so the supported entity names may be subject to change until the final specification comes out; the currently supported entities are described in the draft from 14 December 2007, see @url{http://www.w3.org/TR/2007/WD-xml-entity-names-20071214/}. Also note that multi-character entities are @emph{not} supported in this implementation. A string may also be continued across lines by putting the @code{\} character immediately before the end of the line, which causes the following newline character to be ignored. As of Q 7.0, it is a syntax error if a character escape is not recognized as either a numeric escape or one of the special escapes listed above. Some examples: @smallexample "" @r{empty string} "A" @r{single character string} "\27" @r{ASCII escape character (ASCII 27)} "\033" @r{same in octal} "\0x1b" @r{same in hexadecimal} "\(0x1b)c" @r{ASCII escape followed by a literal @samp{c} character} "Gr\äf" @r{German umlaut, using XML entity escape} "a string" @r{multiple character string} "a \"quoted\" string" @r{include double quotes} "a line\n" @r{include newline} "a very \ @r{continue across line end} long line\n" @end smallexample @node Unicode Support, , Strings, Lexical Matters @section Unicode Support @cindex unicode Since version 7.0 the Q interpreter has full Unicode support. This means that identifiers may actually contain arbitrary alphanumeric letter and digit symbols from the entire Unicode character set. (All non-uppercase alphabetical letters are considered as lowercase letters which may begin a function identifier.) Likewise, operator symbols may consist of arbitrary punctuation characters in the Unicode character set. (We refrain from giving any concrete examples here, to keep this document 7-bit clean.) @cindex strings, unicode Moreover, all string values are internally represented using the UTF-8 encoding, which allows you to represent all characters in the Unicode character set, while retaining backward compatibility with 7 bit ASCII. String literals in the source script will be translated from the system encoding to UTF-8 automatically and transparently, and the @code{\N} notation may specify any valid Unicode point. For this to work, the interpreter must have been built with Unicode support enabled (which is the default). @node Scripts and Modules, Declarations, Lexical Matters, Top @chapter Scripts and Modules @cindex script The basic compilation unit in the Q language is the @dfn{script}, which is simply a (possibly empty) sequence of declarations and definitions in a single source file: @smallexample script : @{declaration|definition@} @end smallexample @cindex bytecode file In order to make a given set of definitions available for use in the interpreter, you can just collect the relevant declarations, variable definitions and equations in a script which you submit to the interpreter for execution. The interpreter in turn invokes the compiler to translate the script into a @dfn{bytecode} file which is loaded by the interpreter (see @ref{Using Q}). A script may also be empty, in which case it does not provide any definitions at all. @cindex module @kindex import @kindex include If you put all your definitions into a single script, that's all there is to it. However, you will often want to break down a larger script into a collection of smaller units which can be managed separately and which are linked together by the compiler into a single bytecode file. To these ends the Q language allows you to put your definitions into several separate script files, which are also called @dfn{modules}. To gain access to the function, variable and type symbols provided by another module, you must then use an @dfn{import} or @dfn{include} declaration, using the following syntax: @smallexample declaration : unqualified-import | qualified-import unqualified-import : 'import' module-spec @{',' module-spec@} ';' | 'include' module-spec @{',' module-spec@} ';' qualified-import : 'from' module-spec 'import' [symbol-specs] ';' | 'from' module-spec 'include' [symbol-specs] ';' module-spec : module-name ['as' unqualified-identifier] module-name : unqualified-identifier | string symbol-specs : symbol-spec @{',' symbol-spec@} symbol-spec : unqualified-identifier ['as' unqualified-identifier] | unqualified-opsym ['as' unqualified-opsym] @end smallexample As of version 7.8, Q supports both @dfn{unqualified} and @dfn{qualified} import clauses. The former allows you to quickly import an entire collection of modules, while the latter gives you precise control over which symbols are to be imported from which modules. We describe each of these in turn. @menu * Unqualified Imports:: * Qualified Imports:: * Implicit Imports and the Prelude:: * The Global Namespace:: @end menu @node Unqualified Imports, Qualified Imports, , Scripts and Modules @section Unqualified Imports @cindex unqualified import @cindex import, unqualified Easiest first, let as take a look at unqualified imports. The following declaration imports two modules @code{foo} and @code{bar}: @smallexample import foo, bar; @end smallexample If you put such declarations into your ``main script'' and then run the script with the interpreter, the imported modules will be loaded as well. To determine which functions, variables and types are actually available to be imported into another script, the Q language allows you to declare symbols either as ``public'' or ``private'', see @ref{Declarations}. Outside a module, only the public symbols are accessible when the module is imported in another module. Normally import is not ``transitive'', i.e., importing a module @code{foo} does @emph{not} automatically give you access to the symbols imported by @code{foo}, but only to the public symbols declared in @code{foo} itself. To work around this, instead of @code{import} you can also use an @code{include} declaration which causes all imports and includes of the included module to be ``reexported''. For instance, let us consider the case that module @code{foo} includes module @code{bar}: @smallexample include bar; @end smallexample Then by importing module @code{foo} you also gain access to the public symbols of module @code{bar} (and, recursively, to the modules included by @code{bar}, etc.). This provides a means to group different modules together in one larger ``umbrella'' module. For instance, the standard prelude script (cf. @ref{The Standard Library}) simply includes most of the other standard library modules. @cindex namespace @cindex identifier, qualified In the Q language, each module has its own separate namespace. This means that two different modules, say @code{foo1} and @code{foo2}, may both provide their own public function symbol named @code{foo}. If both @code{foo1} and @code{foo2} are imported in the same script, you can distinguish between the two by using a @dfn{qualified identifier}, using the module name as a prefix, e.g., @code{foo1::foo} or @code{foo2::foo}. (Writing simply @code{foo} in this case will produce an error message because the reference is ambiguous.) @cindex import, cyclic Import and include declarations can occur anywhere in a script (and will be active from this point on up to the end of the script file), but it is common (and recommended) practice to put them near the beginning of the script. As in most other programming languages, module dependencies must always be acyclic. If the compiler finds a cyclic chain of @code{import} and @code{include} declarations, such as a module importing itself or a module @code{foo} importing a module @code{bar} which in turn imports @code{foo} again, it produces an error message. Another caveat: When using unqualified import clauses, it is much too easy to accidentally ``reuse'' an imported symbol because of a missing local symbol declaration. For instance, if you import a module @code{foo} which happens to export a symbol @code{bar}, and then you later define a function @code{bar} in your script, your definition will use the imported @code{bar} symbol. This is perfectly legal in Q, and may be intended, but if it's not then you have to explicitly declare a new local @code{bar} symbol after the import clause, cf. @ref{Declarations}. If you forget this, you will errorneously redefine the existing @code{bar} function instead. One way to avoid this is to always use qualified imports, as described below. Another possibility is to run the interpreter with the @code{--pedantic} option, in which case it will warn you about symbols from unqualified imports which are referred to with unqualified identifiers, see @ref{Running Compiler and Interpreter}, for details. @cindex module name As indicated in the syntax rules, the names of the modules to be imported can either be specified using an unqualified identifier, or a string denoting a full or relative pathname, e.g.: @smallexample import "/home/ag/q/stuff.q"; @end smallexample @noindent In this case, you can also omit the @samp{.q} suffix, it will be supplied automatically when necessary. Moreover, the module identifier is automatically the basename of the script filename (i.e., @code{stuff} in the above example); therefore, it is a good idea to choose basenames which are legal Q identifiers so that you can use them in qualified symbols. If no absolute path is specified, the interpreter locates script files using its ``search path'', which usually contains the current directory and other system-dependent or user-defined locations where ``library'' scripts are kept, see @ref{Using Q}, for details. @kindex as The @samp{as} keyword can be used to explicitly specify an unambiguous module name. This is useful, in particular, if the basename of the module is @emph{not} a valid identifier, and for the purpose of resolving name clashes. For instance: @smallexample import "my/stdlib.q" as mystdlib; @end smallexample Simply importing @code{"my/stdlib.q"} under its own name in this case would produce an error message, because the name of the module collides with the standard library module of the same name, and the compiler enforces that all module names are unique. @node Qualified Imports, Implicit Imports and the Prelude, Unqualified Imports, Scripts and Modules @section Qualified Imports @cindex qualified import @cindex import, qualified A qualified import takes the form: @smallexample from foo import foo, BAR; @end smallexample This specifically imports the symbols @code{foo} and @code{BAR} from module @code{foo}, and nothing else. What this actually does is to ``redeclare'' the imported symbols in the namespace of the importing module, as explained in @ref{Cross-Checking of Declarations and Aliases}, so that they look like private symbols of the importing module. Thus you can refer to them, e.g., simply as @code{foo} or with the qualified name @code{gnats::foo}, where @code{gnats} is the name of the @emph{importing} module. Note that @code{foo::foo} will @emph{not} work in this case, because a qualified import does @emph{not} bring the module itself into scope. (However, as pointed out below you can use both a qualified @emph{and} an unqualified import in concert to make both @code{gnats::foo} and @code{foo::foo} work.) A note on terminology: In Q parlance, the terms @dfn{qualified import} and @dfn{unqualified import} apply to the import clauses themselves; a qualified import clause is called ``qualified'' simply because it restricts the set of imported symbols. Unqualified and qualified @emph{identifiers} can be used with both styles of import clauses alike. As with unqualified import, there is a second kind of qualified import clause which also reexports the imported symbols, by making them public members of the importing module: @smallexample from foo include foo, BAR; @end smallexample For convenience, you can also write: @smallexample from module import; @end smallexample @noindent or @smallexample from module include; @end smallexample @noindent without listing any symbols to just import or include @emph{all} symbols of the given module qualified. (It goes without saying that this should be used with care.) This is roughly equivalent to @samp{import module} or @samp{include module}, respectively, but the imported symbols are made available as members of the namespace of the client, not the imported module. If this sounds a bit confusing, here is an example which should clarify the differences between unqualified and qualified imports. Let's assume the following three modules @code{A}, @code{B} and @code{C}: @smallexample /* A: */ import B; /* B: */ include C; /* C: */ public foo; @end smallexample Using this setup, the symbol @code{foo} is available as either just @samp{foo} or as @samp{C::foo} in all of @code{A}, @code{B} and @code{C}. Now suppose we change module @code{B} to use a qualified @samp{include} instead: @smallexample /* B: */ from C include foo; /* or just: from C include; */ @end smallexample @code{C}'s namespace hasn't changed, of course, but in both @code{A} and @code{B} the symbol @code{foo} is now available as @samp{foo} or @samp{B::foo}, but @emph{not} as @samp{C::foo} anymore. However, you can also combine unqualified and qualified imports, like this: @smallexample /* B: */ include C; from C include foo; @end smallexample Now the symbol @code{foo} is available as either @samp{foo}, @samp{B::foo} @emph{or} @samp{C::foo} in @code{B} and @code{C} (and the compiler will recognize it as the same symbol no matter which notation you use). Finally, note that since symbols imported using a qualified clause become members of the importing namespace, they must not collide with other symbols declared either locally or in another qualified import clause. Thus, if you have two modules @code{foo1} and @code{foo2} which both export their own (local) symbol @code{foo}, the following will provoke an error message from the compiler because of the clash between the two different @code{foo} symbols: @smallexample from foo1 import foo; from foo2 import foo; @end smallexample In such cases you must either use unqualified import, or employ an @samp{as} clause to rename the symbols while importing them. E.g.: @smallexample from foo1 import foo as foo1; from foo2 import foo as foo2; @end smallexample Note that this is only a problem if the imported symbols are really distinct. If the symbols are just different ``aliases'' of the same symbol defined elsewhere, then you can just import them under the same name into the same scope without any problems. @node Implicit Imports and the Prelude, The Global Namespace, Qualified Imports, Scripts and Modules @section Implicit Imports and the Prelude @cindex implicit import @cindex import, implicit @cindex prelude @pindex prelude.q Actually, there is yet another kind of import, namely the @dfn{implicit import} of the @code{prelude.q} script at the beginning of each script, which in turn includes most standard library modules (see @ref{The Standard Library}) and makes them readily available in your program, without having to use an explicit import declaration. When looking up an unqualified symbol in a given script, the compiler first searches for a symbol defined in that script, then for a symbol in an imported or included module, then for a symbol defined by the prelude and its includes, and finally for a built-in symbol. You can instruct the compiler to inhibit the default import of the prelude, by using the @code{--no-prelude} option, see @ref{Using Q}. Moreover, you can also override the standard prelude with your own, if it occurs on the search path @emph{before} the standard prelude, see @ref{Setting up your Environment}. @node The Global Namespace, , Implicit Imports and the Prelude, Scripts and Modules @section The Global Namespace The namespace available in the interpreter is always that of the main script, i.e., the script given on the command line. This namespace also includes the @dfn{built-in namespace} which contains all the built-in symbols, as well as the implicit imports, i.e., the prelude and all its includes (unless the @code{--no-prelude} option was used). The built-in namespace can be accessed explicitly by using an empty module qualifier, as in @code{::sin}. Qualified prelude symbols use the corresponding module name as the qualifier (e.g., @code{stdlib::cat}), just as with explicit imports. The interpreter also allows you to dynamically import additional modules in the global scope using the @code{import} command, see @ref{Command Language}. Moreover, to facilitate testing and debugging, in the interpreter it is possible to gain access to @emph{all} public and private symbols of the program (also in modules not directly imported in the main script) using qualified identifiers. @node Declarations, Expressions, Scripts and Modules, Top @chapter Declarations @cindex declarations The Q language allows you to declare function and (free) variable symbols explicitly, by means of the syntactic constructs discussed in this chapter. Symbol declarations are mostly optional; if you introduce a new symbol without declaring it, the compiler declares it for you. However, you will sometimes wish to ensure that a new symbol is created to override a symbol of the same name from an imported module, or you want to attach special attributes to a symbol, and then you have to use an explicit declaration. Explicit declarations are also needed to introduce new operator and type symbols. Moreover, while implicit declarations are often convenient when you want to get something done quickly, for larger projects you might prefer to play it safe and ensure that each function symbol actually has a proper explicit declaration. In such cases you can run the compiler with the @code{--pedantic} option (see @ref{Running Compiler and Interpreter}) to warn you about undeclared occurrences of an identifier. @menu * Declaration Syntax:: * Operator Declarations:: * Cross-Checking of Declarations and Aliases:: * Type Declarations:: @end menu @node Declaration Syntax, Operator Declarations, Declarations, Declarations @section Declaration Syntax Syntactically, symbol declarations take the following form: @smallexample declaration : prefix headers ';' | [scope] 'type' unqualified-identifier [':' identifier] ['=' sections] ';' | [scope] 'extern' 'type' unqualified-identifier [':' identifier] ['=' sections] ';' | [scope] 'type' qualified-identifier ['as' unqualified-identifier] ';' | [scope] 'type' unqualified-identifier '==' identifier ';' prefix : scope | [scope] modifier @{modifier@} scope : 'private'|'public' modifier : 'const'|'special'|'extern'|'var'|'virtual' headers : header @{',' header@} header : unqualified-identifier '=' expression | unqualified-identifier @{['~'] variable-identifier@} | qualified-identifier @{['~'] variable-identifier@} ['as' unqualified-identifier] | '(' unqualified-opsym ')' @{['~'] variable-identifier@} ['@@' precedence] | '(' qualified-opsym ')' @{['~'] variable-identifier@} ['@@' precedence] ['as' unqualified-opsym] precedence : unsigned-number | '(' operator ')' sections : section @{'|' section@} section : [prefix] headers @end smallexample For instance, the following are all valid symbol declarations: @smallexample public foo; private extern bar X; public special lambda X Y; special cond::ifelse ~P X Y as myifelse; public (--) Xs Ys; private (::++) Xs Ys as concat; const red, green, blue; var FOO, BAR; var const e = exp 1; type Day = const sun, mon, tue, wed, thu, fri, sat; public type BinTree = virtual bintree Xs | private const nil, bin X T1 T2; @end smallexample @kindex private @kindex public @cindex scope @cindex public symbol @cindex private symbol The keywords @code{private} and @code{public} specify the @dfn{scope} of a symbol. @dfn{Public} symbols are accessible outside a script, and can be imported by other modules (see @ref{Scripts and Modules}). If the keywords @code{private} and @code{public} are omitted, the scope defaults to private. @kindex special @kindex const @kindex extern @kindex var @kindex virtual @cindex constructor @cindex virtual constructor @cindex constant symbol @cindex extern symbol @cindex variable symbol, free The @code{special} keyword serves to declare @dfn{special forms}, which are described in @ref{Special Forms}. Function symbols declared with @code{const} introduce ``constant'' or ``constructor'' symbols; the compiler enforces that expressions created with such symbols are not redefined in an equation (cf. @ref{Equations}). The built-in constants @code{true}, @code{false}, @code{[]} and @code{()} are already predeclared as @code{const}. The @code{virtual} keyword serves to declare ``virtual constructors'' which can be used to define concrete representations (so-called ``views'') of abstract data types, see @ref{Views}. Non-@code{const} function symbols can also be declared as @code{extern}, meaning that the corresponding function is actually implemented by a corresponding ``external'' module, see @ref{C Language Interface}. The @code{var} keyword allows you to declare ``free'' variable symbols, which can be assigned a value by means of a @dfn{variable definition}, see @ref{Free Variables}. If you use such a declaration, the variable symbol may also start with a lowercase letter (if it has not already been declared as a function symbol); note that without such a declaration, an identifier starting with a lowercase letter will implicitly be declared as a function symbol. (The idea behind this is that you can make a variable look like a function symbol, if the variable is actually supposed to be used for function values.) Variable symbols can also be declared as @code{const}; in this case the variable can only be assigned to @emph{once} and always refers to the same value once it has been defined. The built-in variables @code{INPUT}, @code{OUTPUT}, @code{ERROR} and @code{ARGS} are predeclared as @code{const}, see @ref{Command Language}. A variable declaration may also contain an ``initializer'' of the form @code{= X}, where @code{X} is an expression. This has the effect of declaring and defining the variable in a single step. @cindex scope-modifier prefix @cindex modifier @cindex header @cindex argument list @cindex arity As indicated, the scope-modifier prefix of a declaration is followed by a comma-separated list of @dfn{headers}. The first identifier in each header states the symbol to be declared. In function symbol declarations the function identifier may be followed by a list of variable identifiers which are used to denote the arguments of the declared function symbol. The variables are effectively treated as comments; only their number (called the @dfn{arity} of a function symbol) is recorded to check the consistency of different declarations of the same symbol, and to specify the number of ``special'' arguments in a @code{special} declaration. In @code{special} declarations variables may also be prefixed with @samp{~} to declare them as ``non-special'' arguments; see @ref{Special Forms}. @cindex default declaration The first declaration of an unqualified identifier or operator symbol in a module always introduces a new symbol in the module's namespace. By default (if no explicit declaration precedes the first use of a new, unqualified symbol), the compiler automatically declares it as a @code{private} nullary function or variable symbol, depending on the case of the initial letter of the identifier (as already mentioned in @ref{Lexical Matters}, capitalized identifiers are interpreted as variable symbols, others as function symbols). @node Operator Declarations, Cross-Checking of Declarations and Aliases, Declaration Syntax, Declarations @section Operator Declarations @cindex operator declaration As of Q 6.2, it is also possible to declare new prefix and infix @dfn{operator symbols}. Such declarations are never optional; operator symbols must always be declared before they are used. Syntactically, they take exactly the same form as a function symbol declaration, but the operator symbol is given inside parentheses. As discussed in @ref{Lexical Matters}, the operator symbol may either be a function identifier or a sequence of punctuation characters. Examples: @smallexample public (myop) X Y; public (--) X Y; @end smallexample You can also specify a precedence level, like so: @smallexample public (myop) X Y @@ 2; // use explicit precedence level ... public (myop) X Y @@ (<); // ... or precedence level of given operator @end smallexample If no precedence is given, it defaults to 2, which is the precedence of the relational operators. This is described in more detail in @ref{User-Defined Operators}. @node Cross-Checking of Declarations and Aliases, Type Declarations, Operator Declarations, Declarations @section Cross-Checking of Declarations and Aliases @cindex multiple declarations @cindex declarations, consistency The Q language allows multiple declarations of the same symbol, and the compiler will verify the consistency of all declarations of a given symbol. That is, if a symbol is declared with different scope (@code{private} or @code{public}), attributes (like @code{var}, @code{const} or @code{special}) or different number of arguments, or if non-special arguments of a special form are declared differently, then the compiler will issue an error message. Note, however, that the compiler will @emph{never} cross-check the number of parameters in a function symbol declaration with the actual number of arguments to which the function is applied when it is used in an equation or a variable definition. This is because in Q it is perfectly legal to have a function symbol applied to a varying number of arguments for the purpose of constructing partial applications. Also note that if a local function symbol is first used without an explicit declaration then it will be implicitly declared to have a zero argument count. If the symbol is then later declared with a nonzero number of arguments or other non-default attributes then the compiler will print an error message. @cindex redeclaration @cindex reexport @cindex aliasing As indicated by the syntactic rules, it is also possible to @dfn{redeclare} a @emph{qualified} symbol. This requires that the target module either is the current module or has already been imported in the current scope using an unqualified import clause, and causes the compiler to both cross-check the declaration with the declaration in the imported module and redeclare the symbol within the current namespace. Such a redeclaration serves several different purposes. First and foremost, it allows you to ensure that an imported symbol was actually declared with the given attributes. Second, it lets you resolve name clashes by redeclaring the symbol in the current scope where it overrides unqualified imports of other modules; if you want, you can also import the symbol under a new name using an @samp{as} clause (the new symbol is then also referred to as an @dfn{alias} of the original one). In these two cases the symbol will normally be redeclared as a private symbol. Third, by redeclaring a symbol as @code{public}, you cause the symbol to be @dfn{reexported} by the current module. Examples: @smallexample import gnats; private gnats::foo X Y; // cross-check declaration of gnats::foo public gnats::foo X Y as bar; // reexport gnats::foo as bar @end smallexample Of course, the same also works with operator symbols. E.g., here is how you declare a new operator symbol @code{concat} which behaves exactly like the built-in @samp{++} operator: @smallexample public (::++) X Y as concat; @end smallexample There is yet another usage of an imported symbol redeclaration, namely the @code{extern} redeclaration. This is only possible with function symbols and if the redeclared symbol is not already declared @code{extern} by another module. It instructs the compiler that this module provides an external definition of the symbol which will override equational definitions in other modules, see @ref{C Language Interface}, for details. Note that, as of Q 7.8, most of the uses of qualified symbol redeclarations are now covered in a more convenient fashion by qualified imports, see @ref{Qualified Imports}. Thus qualified symbol redeclarations will now mostly be used for cross-checking declarations and for external overrides. @node Type Declarations, , Cross-Checking of Declarations and Aliases, Declarations @section Type Declarations @cindex type declaration @cindex scope, symbol in type declaration A @dfn{type declaration} introduces a type identifier, an optional supertype for the type, and an optional list of ``constructor'' symbols for the type. It associates the function symbols in the list with the given type. The symbol list is a @code{|}-delimited list of individual @dfn{sections}. Each section takes the form of an ordinary symbol declaration, consisting of scope/modifier prefix and a comma-delimited list of headers. The prefix is optional; by default, a symbol has the same scope as the type it belongs to. To change scope and attributes of a symbol on the list, you use an appropriate scope/modifier prefix. For instance, you can declare a public @code{BinTree} type with private constructors @code{nil} and @code{bin} as follows: @smallexample public type BinTree = private const nil, bin X T1 T2; @end smallexample Each constructor symbol may only be declared once; otherwise the compiler will issue an error message. Hence the compiler enforces that a constructor symbol only belongs to a single type, and that the number of arguments of that symbol is determined uniquely. @cindex virtual constructor A constructor symbol may also be declared as @code{virtual}, which indicates that it may be used in concrete representations of abstract data types, so-called ``views''. Such symbols are usually non-@code{const} symbols implementing a public construction function which delivers values of the type they belong to, see @ref{Views}, for details. For instance, to add a virtual constructor @code{bintree} to the above @code{BinTree} type, you might declare the type as follows: @smallexample public type BinTree = virtual bintree Xs | private const nil, bin X T1 T2; @end smallexample @cindex type identifier Type identifiers may begin with either an upper- or lowercase letter (the convention, however, is to use capitalized identifiers). There may only be a single declaration for each type. As of Q 7.8, type identifiers must also be distinct from function or variable symbols declared in the same namespace, as both type and function/variable identifiers may now occur in the same context (namely, a qualified import clause). @cindex predefined type symbols Types are used on the left-hand side of equations to restrict the set of expressions which can be matched by a variable; see @ref{Type Guards}, for details. The Q language has a number of predefined type symbols which distinguish the corresponding types of objects built into the language: @code{Bool}, @code{Int}, @code{Float}, @code{Real} (which is the supertype of both @code{Int} and @code{Float}), @code{Num} (the supertype of @code{Real}), @code{String}, @code{Char} (the subtype of @code{String} denoting the single-character strings), @code{File}, @code{List}, @code{Stream}, @code{Tuple} and @code{Function}. (Besides this, there are also two built-in types for exceptions, see @ref{Exception Handling}.) Like function symbols, types imported from other modules can also be redeclared (possibly under a new name), and reexported, e.g.: @smallexample public type array::Array as MyArray; @end smallexample In this case, no constructor symbols or supertype are specified. (As of Q 7.8, it is also possible to use a qualified import clause to achieve this, see @ref{Qualified Imports}. Also note that if a type is imported using a qualified import clause, then its constructor symbols will @emph{not} be imported automatically; you will have to explicitly list them in the import clause if you need them.) As of Q 7.1, there is an alternative, more customary syntax for introducing type aliases which looks as follows: @smallexample type Integer == Int; @end smallexample Note that here the alias to be defined is specified first, and the existing type to be aliased can be specified without a module qualifier. This works exactly like an ordinary alias declaration, but resembles more closely the syntax commonly used for defining type synonyms in other languages like Pascal and Haskell. Types can also be declared as @code{extern}, to indicate that they are realized in a corresponding C module, as described in @ref{C Language Interface}. External types may have a supertype, but only virtual constructors. For instance: @smallexample extern type Bar = virtual bar X; @end smallexample In difference to function symbols, an existing type imported from another module cannot be redeclared as @code{extern}. Therefore an external definition must always be given by the module which originally declares the type. @node Expressions, Equations and Expression Evaluation, Declarations, Top @chapter Expressions @cindex expression The notion of expressions is an intrinsic part of most, if not all, programming languages. However, in most programming languages expressions are merely treated as descriptions of computations to be performed in order to obtain the value of an expression, which usually belongs to some basic type provided by the language. In contrast, in the Q language expressions are objects in their own right which, as we shall see, are manipulated according to computational rules specified by the programmer. @cindex basic expressions @cindex compound expressions @cindex prefix operator symbols @cindex infix operator symbols @cindex list enumerations @cindex list comprehensions @cindex stream enumerations @cindex stream comprehensions @cindex tuple enumerations @cindex tuple comprehensions As usual, expressions are built from certain kinds of basic objects. In the Q language, these are integers, floating point numbers, strings, function and variable symbols. These objects are combined into larger, compound expressions by means of applications, tuples, lists and streams. For convenience, the Q language also has prefix, infix and mixfix operator symbols for the most common operations; these are actually treated as ``syntactic sugar'' for applications. Syntactic sugar is also provided for list, tuple and stream enumerations and comprehensions. The syntax of all these entities is described by the following grammatical rules: @smallexample expression : identifier | 'var' unqualified-identifier | variable-identifier ':' identifier | number | string | expression expression | unary-op expression | expression binary-op expression | 'if' expression 'then' expression ['else' expression] | '\' expression @{ expression @} '.' expression | '(' [element-list|enumeration|comprehension] ')' | '[' [element-list|enumeration|comprehension] ']' | '@{' [element-list|enumeration|comprehension] '@}' | '(' op ')' | '(' expression binary-op ')' | '(' binary-op expression ')' element-list : expression-list [',' | ';' | '|' expression] enumeration : expression-list '..' [expression] comprehension : expression ':' expression-list expression-list : expression @{',' expression@} | expression @{';' expression@} @end smallexample @menu * Constants and Variables:: * Applications:: * Lists Streams and Tuples:: * Built-In Operators:: * User-Defined Operators:: @end menu @node Constants and Variables, Applications, Expressions, Expressions @section Constants and Variables @cindex integer @cindex floating point number @cindex numeric constants, range and precision The nonterminals @code{identifier}, @code{op}, @code{unary-op}, @code{binary-op}, @code{number} and @code{string} occuring in the syntax rules above refer to the lexical entities introduced in @ref{Lexical Matters}. Note that the Q language actually distinguishes two different types of identifiers (function and variable identifiers), two different kinds of operator symbols (unary and binary) and two different types of numeric quantities (integers and floating point numbers). Integers are implemented as ``bignums'' using the GNU multiprecision package (GMP); thus the size of an integer value is only limited by available memory. Floating point values are implemented using 64 bit (i.e., double precision) floating point numbers; on most machines, these should provide nonzero absolute values ranging from 1.7E-308 to 1.7E308 and a precision of 15 decimal digits. @cindex string @cindex character sequences @cindex null character @kindex \0 String constants are character sequences internally represented as character arrays permitting constant-time access to the individual characters of a string. There is no a priori length restriction for string constants. In the current implementation, the null character @code{\0} is reserved as a string terminator and must not be used as an ordinary string character. @cindex built-in constants @cindex truth values @cindex empty list @cindex empty stream @cindex empty tuple Furthermore, the Q language provides five other built-in constants which denote the truth values (@code{true} and @code{false}), and the empty list, stream and tuple (@code{[]}, @code{@{@}} and @code{()}, to be discussed in @ref{Lists Streams and Tuples}). @cindex variable symbol, bound @cindex variable symbol, free A variable symbol can be ``bound'' or ``free'', depending on the context in which it occurs. We say that a variable is @dfn{bound} in an equation if it occurs on the left-hand side of the equation. Otherwise, the variable is a @dfn{free} variable. In this case, the variable may denote a value (introduced with @code{def}), or simply stands for itself. In any case, a variable is a legal atomic expression which may stand wherever a constant is allowed. On the @emph{left-hand side} of equations, it is possible to declare that a variable is of a given @dfn{type} by using the notation: @code{@var{variable}:@var{type}}. This requires that you have previously declared @var{type} as a type identifier, see @ref{Declarations}. When a type identifier is specified with a variable, the variable will only match values belonging to the given type, cf. @ref{Type Guards}. @cindex variable declaration, inline On the @emph{right-hand side} of equations, an unqualified identifier can also be prefixed with the keyword @code{var} to indicate that the identifier is to be treated as a free variable symbol. Such @dfn{inline} variable declarations can be used to escape free variables (if the same identifier is also introduced as a bound variable on the left-hand side of the equation) and to declare free variable symbols on the fly, see @ref{Local Variables}, for details. @node Applications, Lists Streams and Tuples, Constants and Variables, Expressions @section Applications @cindex application @cindex function @cindex argument Application is probably the most important single construct in the Q language. It allows you to apply one object (the ``function'') to another one (the ``argument''). This construct is used to denote ``function calls'' such as @code{sqrt 2} as well as ``constructor terms'' such as @code{bin X T1 T2} which encode tree-like data structures. Also, as we will see in @ref{Built-In Operators}, operators like @code{+}, @code{-}, etc. are merely syntactic sugar for applications. As in other contemporary functional languages, application is a binary operation written simply as juxtaposition: @code{X Y} denotes the application of @code{X} to @code{Y}, both of which may be arbitrary expressions themselves. Application associates to the left; the construct @code{X Y1 @dots{} Yn} is equivalent to @code{(@dots{}((X Y1) Y2) @dots{}) Yn}, and denotes the application of @code{X} to n arguments @code{Y1}, @dots{}, @code{Yn}. This style of writing function applications is commonly referred to as @dfn{currying}, after the American logician H.B. Curry. We will have to say more about this shortly. Here are some valid examples of applications: @smallexample sqrt 2 sqrt (Y+1) foo X (bar (Y-1)) Z @end smallexample Note that since application is left-associative, nested applications in arguments must be set off with parentheses. For instance, @code{foo (bar X)} applies @code{foo} to @code{bar X}, whereas @code{foo bar X} applies @code{foo} to two arguments @code{bar} and @code{X}. @cindex currying Since currying is so ubiquitous in functional programming, you should be well familiar with it, so let us briefly explain what it is, and what it is good for. Functions of multiple arguments can generally be defined in two different ways: @itemize @bullet @item As a function taking a single, structured argument (usually a tuple). This is the traditional, ``uncurried'' method one commonly encounters in mathematics. For instance, we might define the function @code{max}, which computes the maximum of two values, as follows: @smallexample max (X,Y) = X if X>Y; = Y otherwise; @end smallexample @item Recursively, as a function which, when given the first argument, yields another function of the remaining arguments. Such functions are called @dfn{curried}. For instance, here is a curried definition for the @code{max} function: @smallexample max X Y = X if X>Y; = Y otherwise; @end smallexample @noindent Here @code{max X Y} is to be read as @code{(max X) Y}, meaning that by applying @code{max} to the first argument @code{X}, we obtain another function @code{max X}, which, when applied to the second argument @code{Y}, yields the maximum of @code{X} and @code{Y}. (Incidentally, this curried form is also the way that the @code{max} function is actually defined in the standard library. In fact, most built-in and standard library functions use the curried form as it is more flexible.) @end itemize The Q language supports both kinds of notations. Choosing between the two is more than just a matter of taste. Besides saving parentheses, curried functions provide a cheap way to derive new functions through ``partial applications'' of a multi-argument function. For instance, given the curried definition of @code{max} from above, @code{max 0} can be used to denote the function computing the ``nonnegative part'' of its argument (which is the argument itself if it is nonnegative, and zero otherwise). This does not work with the uncurried definition since that definition requires us to specify both arguments of @code{max} in one chunk; instead, we would have to start defining the derived function from scratch. @cindex variadic functions Uncurried definitions also have their merits. In particular, they allow us to define @dfn{variadic} functions, i.e., functions which can handle a varying number of arguments. For instance, the following definition enables us to apply the @code{max} function to both pairs and triples: @smallexample max (X,Y) = X if X>=Y; = Y otherwise; max (X,Y,Z) = max (X,max (Y,Z)) @end smallexample In fact, in the Q language it is possible to define generic functions which apply to any number of arguments specified as a tuple. For instance, the following version of @code{max} handles tuples of any size at least 2: @smallexample max (X,Y) = X if X>=Y; = Y otherwise; max (X,Y|Zs) = max (X,max (Y|Zs)); @end smallexample @noindent (In the above example, the notation @code{(@dots{}|Zs)} is used to denote a tuple with ``tail'' @code{Zs}. This is explained in the following section.) @node Lists Streams and Tuples, Built-In Operators, Applications, Expressions @section Lists, Streams and Tuples The Q language provides three closely related general constructs for representing sequences of objects: lists, streams and tuples. @cindex list We start out with a discussion of the @dfn{list} data structure. The constant @code{[]} denotes the empty list. In general, a list consisting of elements @code{X1}, @code{X2}, @dots{}, @code{Xn} is denoted @code{[X1,X2,@dots{},Xn]}. For instance, @code{[a,b,c]} consists of three elements (symbols) @code{a}, @code{b} and @code{c}. In contrast to statically typed languages like Haskell and ML, list members may have different types; e.g., @code{[a,10,"a string"]} is a perfectly well-formed list consisting of a symbol, a number and a string. It is also possible to have nested lists, as in @code{[a,[b,c]]} which consists of two elements, the symbol @code{a} and the list @code{[b,c]}. You can also have a trailing comma in a list, as in @code{[a,b,c,]}. The trailing comma is ignored, so the above is exactly the same as @code{[a,b,c]}. This makes it possible to format a list as follows, which lets you add new items at the end more easily: @smallexample def ITEMS = [ "This is the first item", "This is the second item", @dots{} "This is the last item", ]; @end smallexample As in Prolog, lists are actually represented in a right-recursive fashion using a binary constructor @code{[|]} which takes as its arguments the @dfn{head} element of the list and the list of the remaining elements, called the @dfn{tail} of the list. Thus, e.g., @code{[a,b,c]} is simply a shorthand notation for @code{[a|[b,c]]}, which denotes a list with head @code{a} and tail @code{[b,c]} (which is in turn a shorthand for @code{[b|[c]]}, etc.). You can also mix both styles of notation; for instance, @code{[a,b|[c,d]]} is another way to represent the 4-element list @code{[a,b,c,d]}. Note that @code{[a|[b,c]]} is different from @code{[a,[b,c]]}: the former denotes a three-element list, while the latter is a two-element list whose second member happens to be a list itself. Also note that the @code{[|]} constructor can in fact be applied to any pair of values (the second value does not have to be a list); e.g., @code{[a|b]} is a perfectly well-formed expression (although the built-in length, indexing and concatenation operations described in @ref{String List and Tuple Operators}, will fail on such values). @cindex enumeration @cindex list comprehensions Q also provides so-called @dfn{enumerations} to construct lists from a range of values. These usually take the form @code{[X1,X2,@dots{},Xn..Y]} which is just syntactic sugar for the application @code{enum [X1,X2,@dots{},Xn] Y} involving the built-in @code{enum} function, see @ref{Enumeration Types}, for details. Syntactic sugar is also provided for @dfn{list comprehensions}, which are discussed in @ref{Conditionals and Comprehensions}. Analogous constructs are also provided for streams and tuples. @cindex stream @dfn{Streams} look exactly like lists except that they are written with curly braces. The difference between lists and streams is that the head and tail of a stream are not evaluated until their values are actually required during the course of a computation. For instance, you will find that if you enter an expression like @code{@{1+1,2+2,3+3@}} in the interpreter, then the value of that expression will be just the expression itself, because the embedded applications of the @samp{+} operator are not evaluated: @smallexample ==> @{1+1,2+2,3+3@} @{1+1,2+2,3+3@} @end smallexample In contrast, if you enter the same sequence as a list, its elements will be evaluated immediately: @smallexample ==> [1+1,2+2,3+3] [2,4,6] @end smallexample A stream element will be evaluated automatically when it is accessed, e.g., using the index operator: @smallexample ==> @{1+1,2+2,3+3@}!1 4 @end smallexample This is also known as @dfn{lazy} or @dfn{non-strict evaluation}: the evaluation of the stream elements is deferred until they are actually needed. In fact, the entire tail of a stream may be deferred, which makes it possible to create infinite sequences whose members are produced ``on the fly'' while traversing the sequence. Streams are implemented in terms of so-called ``special forms'' which will be introduced in @ref{Special Forms}. We therefore postpone a closer description of this data structure until @ref{Special Constructors and Streams}. Also note that the Q language itself only defines the stream constructors; most other stream functions are actually implemented by the standard library script @code{stream.q}, see @ref{Streams}. @cindex tuple @dfn{Tuples} also work in much the same fashion as lists: The constant @code{()} denotes the empty tuple, and a tuple consisting of elements @code{X1}, @code{X2}, @dots{}, @code{Xn} is written as @code{(X1,X2,@dots{},Xn)}, which is equivalent to @code{(X1|(X2|@dots{}|(Xn|())))}, where the notation @code{(X|Xs)} denotes a tuple consisting of a first element @code{X} and the tuple @code{Xs} of the remaining elements. A trailing comma may follow the last tuple element, so that @code{(a,b,c,)} is the same as @code{(a,b,c)}. The trailing comma is optional in all cases except one: As in the Python programming language, 1-tuples are indicated with a trailing comma, to distinguish them from ordinary parenthesized expressions. Hence @code{(a,)} denotes the 1-tuple with the single member @code{a}, while @code{(a)} just denotes @code{a} itself. Note that in contrast to languages like ML and Haskell, tuples are not actually needed to represent sequences of values with different types, as Q's list data structure already allows you to do that. Also, Q's tuples are a lot more flexible than in ML and Haskell, as they can be handled mostly like list values. The big difference between lists and tuples in the Q language is that, while lists are always represented as recursive data objects using a binary constructor symbol (just the way that they are written), tuples are actually implemented as ``vectors'' which are stored as contiguous sequences in memory. (This only works for ``well-formed'' tuples. If the tail @code{Xs} of a tuple @code{(X|Xs)} is not a tuple, then this value can only be represented using an explicit application of the tuple constructor.) Therefore tuples normally use up much less memory than lists of the same size, and they also allow constant-time access to their members. The size of a tuple can be determined in constant time as well. In contrast, the same operations, when applied to a list, require time proportional to the size of the list. On the other hand, lists are more efficient when accessing the tail of a list and when a new element is prepended to a list, which can both be done in constant time. Here a tuple needs time proportional to its size, since the member sequence of the original tuple must be copied when accessing its tail or when prepending an element. These tradeoffs should be carefully considered when deciding whether to implement a given sequence as a list or a tuple. Tuples are usually the best choice for implementing fixed sequences requiring fast random access to its individual members, whereas lists provide the most efficient representation if the elements are traversed and manipulated sequentially. If necessary, you can easily convert between these two representations using the built-in functions @code{list} and @code{tuple}, see @ref{Conversion Functions}. @cindex grouping Lists, streams and tuples are frequently combined to create nested structures like a tuple of lists and/or streams, a list (or stream) of lists, or a list (or stream, or tuple) of tuples. The latter construct (sequences of tuples) is so common that Q provides special support for it in the form of the following ``grouping'' syntax. The notation @code{[X1, X2, @dots{}; Y1, Y2, @dots{}; @dots{}]} is a shorthand for @code{[(X1, X2, @dots{}), (Y1, Y2, @dots{}), @dots{}]}. Note that there must be at least one element in each group (but the number of elements in different groups may vary), and there must be at least one group (a single group is denoted with a trailing @samp{;}). Thus, for instance, @code{[A,B;C,D;X,Y,Z]} is the same as @code{[(A,B),(C,D),(X,Y,Z)]}, and @code{[X,Y,Z;]} is the same as @code{[(X,Y,Z)]}. The same notation also applies to streams and tuples. @node Built-In Operators, User-Defined Operators, Lists Streams and Tuples, Expressions @section Built-In Operators @cindex operator symbols @cindex operator precedence Besides the forms of expressions already discussed above, the Q language also provides a collection of prefix, infix and mixfix operator symbols for common arithmetic, logical, relational and other operations. A complete list of these symbols is given below, in order of decreasing precedence: @table @code @item ' ` ~ & quotation operators (unary) @item . function composition @item ^ ! exponentiation and subscript @item - # not prefix operators (unary) @item * / div mod and and-then multiplication operators @item ++ + - or or-else addition operators @item < > = <= >= <> == relational operators @item $ infix application operator @item if-then-else conditional expressions @item || sequence operator @item \X.Y lambda abstractions @end table @cindex operator associativity @cindex parenthesized expressions Most of these symbols have their usual meaning; a closer description follows below. All binary operators are left-associative, with the exception of @code{^}, @code{!} and @code{$} which associate to the right, and the relational operators which are nonassociative. Application takes precedence over all these operations except the quotation operators; hence @code{sqrt X^3} denotes @code{(sqrt X)^3} and @emph{not} @code{sqrt (X^3)}. The quotation operators have the highest possible precedence, and hence bind stronger than even applications. Parentheses are used to group expressions and overwrite default precedences and associativity as usual. C programmers will also note that the logical operators have the same ``wrong'' precedence as in Pascal. Thus you should make sure that you always parenthesize relational expressions when combining them with logical connectives. @cindex unary minus @cindex negative numbers You should also note that unary minus @emph{must} be parenthesized when appearing in an argument of a function application. For instance, although @code{foo X -Y} is a perfectly well-formed expression, it denotes @code{(foo X) - Y} and @emph{not} @code{foo X (-Y)} as you might have intended by the spacing which is however ignored by the Q compiler. As already noted in @ref{Lexical Matters}, unary minus in front of a number is special; it causes values such as @code{-2} to be interpreted as negative numbers rather than denoting an explicit application of the unary minus operator (an explicit application of unary minus can be denoted using the built-in @code{minus} symbol; see below). The rules for parenthesizing negative numbers are the same as those for unary minus; you have to write @code{foo X (-2)} instead of @code{foo X -2} (which denotes @code{(foo X) - 2}). @cindex operator, as prefix function @cindex operator section In the Q language, expressions involving prefix and infix operators are merely syntactic sugar for applications. By enclosing such an operator in parentheses, you can turn it into an ordinary prefix function. For instance, @code{X+Y} is exactly the same as @code{(+) X Y}, and @code{not X} actually denotes the application @code{(not) X}. The built-in operators simply provide a more convenient way for applying these operations to their arguments, which resembles common mathematical notation. Moreover, you can also turn a binary operator into a unary function by specifying either the left or the right operand. E.g., @code{(1/)} denotes the reciprocal and @code{(*2)} the doubling function. Such constructs are commonly called @dfn{operator sections}. Inside a section, the usual precedence and associativity rules apply. For instance, the @code{X+3} subterm in @code{(*(X+3))} @emph{must} be parenthesized because @code{*} has a higher precedence than @code{+}, and hence the partial expression @code{(*X+3)} is not well-formed. The @code{-} operator plays a somewhat awkward role in the syntax, since it is used to denote both unary and binary minus. Q adopts the convention that the notation @code{(-)} always denotes @emph{binary} minus; unary minus may be denoted by the built-in @code{minus} function. Thus the expression @code{minus X} applies unary minus to @code{X}. Note that this construct always denotes an explicit application of the unary minus operation. For instance, @code{minus 5} denotes the application of unary minus to the integer @code{5}, while @code{-5} is a negative integer. Also note that the construct @code{(-X)} is @emph{not} an operator section, but a parenthesized expression involving unary minus. The easiest way to construct an operator section which subtracts a given value from its argument is to formulate the function using the addition operator as in @code{(+(-X))}. Another special case are the mixfix operators @code{if-then-else} (conditional expression) and @code{\X.Y} (lambda abstraction). These are just syntactic sugar for the standard library functions @code{ifelse}/@code{when} and the built-in function @code{lambda}, so no special syntax is needed to turn them into prefix functions. @menu * Quotation Operators:: * Function Composition:: * Arithmetic Operators:: * Relational Operators:: * Logical and Bit Operators:: * String List and Tuple Operators:: * Application and Sequence Operators:: * Conditional Expressions and Lambdas:: @end menu @node Quotation Operators, Function Composition, Built-In Operators, Built-In Operators @subsection Quotation Operators @cindex quotation operators The @code{'} (quote), @code{`} (backquote), @code{~} (tilde) and @code{&} (ampersand) operators are used to deal with so-called @dfn{special forms}. The quote operator quotes an expression as a literal; it is a constructor symbol and hence becomes part of the quoted expression. The backquote and tilde operators are used to ``splice'' and ``force'' subterms in an expression. The ampersand operator causes ``special arguments'' to be memoized. We postpone a discussion of these operations until @ref{Special Forms}. @node Function Composition, Arithmetic Operators, Quotation Operators, Built-In Operators @subsection Function Composition @cindex function composition The @samp{.} operator denotes function composition: @code{(F.G) X = F (G X)}. For instance, to double a value and then add 1 to it, you can use an expression like @code{((+1).(*2)) X}. Note the parentheses around the composition -- as application binds stronger than composition, @code{F.G X} is interpreted as @code{F.(G X)} rather than @code{(F.G) X}. If a composition involves applications with integer arguments, extra whitespace may be needed to distinguish between the @samp{.} operator and the decimal point; e.g., @code{F 5 . G} denotes a composition of the functions @code{F 5} and @code{G}, whereas @code{F 5.G} is an application of the function @code{F} to two arguments, the floating point value @code{5.} and the function @code{G}. Also note that the composition operator is associative, since @code{(F.G.H) X = ((F.G).H) X = F (G (H X)) = (F.(G.H)) X}. @node Arithmetic Operators, Relational Operators, Function Composition, Built-In Operators @subsection Arithmetic Operators @cindex arithmetic operators @kindex div @kindex mod The operators @code{+}, @code{-}, @code{*}, @code{/} denote the sum, the difference, the product and the quotient of two numeric values, respectively. As already noted, @code{-} is also used as unary minus. The operators @code{div} and @code{mod} denote integer quotient and modulo, respectively; they only apply to integers. The @code{^} operator denotes exponentiation, as defined by @code{X^Y = exp (ln X*Y)}; it always returns a floating point value. (If @code{X<0} then @code{X^Y} is defined only if @code{Y} is an integer. @code{0^0} is left undefined as well, so if you would like to have that @code{0^0 = 1} then you must add corresponding equations yourself. Also note that the @code{complex.q} standard library module extends the built-in definition of the exponentiation operator to handle the case that @code{X<0} with general exponent; see @ref{Complex Numbers}.) The argument and corresponding result types of these operations are summarized in the following table (@code{Int} denotes integer, @code{Float} floating point, and @code{Real} integer or floating point values): @iftex @table @code @item + - * @code{Int} @code{Int} @math{@rightarrow} @code{Int} @* @code{Int} @code{Float} @math{@rightarrow} @code{Float} @* @code{Float} @code{Int} @math{@rightarrow} @code{Float} @* @code{Float} @code{Float} @math{@rightarrow} @code{Float} @item / ^ @code{Real} @code{Real} @math{@rightarrow} @code{Float} @item div mod @code{Int} @code{Int} @math{@rightarrow} @code{Int} @item - (unary) @code{Int} @math{@rightarrow} @code{Int} @* @code{Float} @math{@rightarrow} @code{Float} @end table @end iftex @ifinfo @table @code @item + - * @code{Int} @code{Int} -> @code{Int} @* @code{Int} @code{Float} -> @code{Float} @* @code{Float} @code{Int} -> @code{Float} @* @code{Float} @code{Float} -> @code{Float} @item / ^ @code{Real} @code{Real} -> @code{Float} @item div mod @code{Int} @code{Int} -> @code{Int} @item - @r{(unary)} @code{Int} -> @code{Int} @* @code{Float} -> @code{Float} @end table @end ifinfo Note that, as of Q 7.2, all floating point operations properly deal with IEEE floating point infinities and NaN (``not a number'') values. Consequently, division by zero now yields one of the special values @code{inf}, @code{-inf} or @code{nan}, depending on the value of the first operand. (Of course, this will only work as described on systems with a proper IEEE floating point implementation. The interpreter makes no attempt to emulate IEEE floating point values on systems which do not provide them natively.) @node Relational Operators, Logical and Bit Operators, Arithmetic Operators, Built-In Operators @subsection Relational Operators @cindex relational operators @cindex string comparison @cindex lexicographic order, strings The operators @code{<}, @code{>}, @code{=}, @code{<=}, @code{>=}, @code{<>} are binary predicates meaning ``less'', ``greater'', ``equal'', ``less or equal'', ``greater or equal'' and ``not equal'', respectively. The built-in definition of these operations only applies to numbers, strings and truth values. All relational operators return truth values (@code{true}, @code{false}) as results. Strings are compared lexicographically, on the basis of the character encoding (which, as of Q 7.0, usually is Unicode). Truth values are ordered by @code{false < true}. If you would like to compare other types of values than the basic objects mentioned above, normally you will have to provide suitable definitions yourself. For instance, you might wish to extend the equality operation to other built-in and user-defined data structures such as lists, trees, etc., by ``overloading'' the @code{=} operator accordingly. The following equations implement an equality predicate on lists (the parentheses on the left-hand side are necessary to prevent the equality operation to be interpreted as the equal sign separating left-hand and right-hand side of an equation): @smallexample ([] = []) = true; ([] = [Y|Ys]) = false; ([X|Xs] = []) = false; ([X|Xs] = [Y|Ys]) = (X=Y) and then (Xs=Ys); @end smallexample Rules for other comparison operators (@code{<>}, @code{<=}, etc.) could be added in the same fashion. Actually, the standard library provides corresponding definitions; see @ref{The Standard Library}. A special case arises with types consisting of only nullary constructor symbols declared with @code{const}, so-called @dfn{enumeration types}, see @ref{Types}. The values of such a type can always be compared with the relational operators, using the order in which they are listed in the type declaration. (A special case of this is the order of the built-in truth values.) For instance, assume that the @code{Day} type is declared as follows: @smallexample type Day = const sun, mon, tue, wed, thu, fri, sat; @end smallexample @noindent Then the listed constants will be ordered as @code{sun < mon < @dots{} < sat}. @cindex syntactic equality Besides the