StandardMLGotchas
=================
:toc:

This page contains brief explanations of some recurring sources of
confusion and problems that SML newbies encounter.

Many confusions about the syntax of SML seem to arise from the use of
an interactive REPL (Read-Eval Print Loop) while trying to learn the
basics of the language.  While writing your first SML programs, you
should keep the source code of your programs in a form that is
accepted by an SML compiler as a whole.

== The `and` keyword ==

It is a common mistake to misuse the `and` keyword or to not know how
to introduce mutually recursive definitions.  The purpose of the `and`
keyword is to introduce mutually recursive definitions of functions
and datatypes.  For example,

[source,sml]
----
fun isEven 0w0 = true
  | isEven 0w1 = false
  | isEven n = isOdd (n-0w1)
and isOdd 0w0 = false
  | isOdd 0w1 = true
  | isOdd n = isEven (n-0w1)
----

and

[source,sml]
----
datatype decl = VAL of id * pat * expr
           (* | ... *)
     and expr = LET of decl * expr
           (* | ... *)
----

You can also use `and` as a shorthand in a couple of other places, but
it is not necessary.

== Constructed patterns ==

It is a common mistake to forget to parenthesize constructed patterns
in `fun` bindings.  Consider the following invalid definition:

[source,sml]
----
fun length nil = 0
  | length h :: t = 1 + length t
----

The pattern `h :: t` needs to be parenthesized:

[source,sml]
----
fun length nil = 0
  | length (h :: t) = 1 + length t
----

The parentheses are needed, because a `fun` definition may have
multiple consecutive constructed patterns through currying.

The same applies to nonfix constructors.  For example, the parentheses
in

[source,sml]
----
fun valOf NONE = raise Option
  | valOf (SOME x) = x
----

are required.  However, the outermost constructed pattern in a `fn` or
`case` expression need not be parenthesized, because in those cases
there is always just one constructed pattern.  So, both

[source,sml]
----
val valOf = fn NONE => raise Option
             | SOME x => x
----

and

[source,sml]
----
fun valOf x = case x of
                 NONE => raise Option
               | SOME x => x
----

are fine.

== Declarations and expressions ==

It is a common mistake to confuse expressions and declarations.
Normally an SML source file should only contain declarations.  The
following are declarations:

[source,sml]
----
datatype dt = ...
fun f ... = ...
functor Fn (...) = ...
infix ...
infixr ...
local ... in ... end
nonfix ...
open ...
signature SIG = ...
structure Struct = ...
type t = ...
val v = ...
----

Note that

[source,sml]
----
let ... in ... end
----

isn't a declaration.

To specify a side-effecting computation in a source file, you can write:

[source,sml]
----
val () = ...
----


== Equality types ==

SML has a fairly intricate built-in notion of equality.  See
<:EqualityType:> and <:EqualityTypeVariable:> for a thorough
discussion.


== Nested cases ==

It is a common mistake to write nested case expressions without the
necessary parentheses.  See <:UnresolvedBugs:> for a discussion.


== (op *) ==

It used to be a common mistake to parenthesize `op *` as `(op *)`.
Before SML'97, `*)` was considered a comment terminator in SML and
caused a syntax error.  At the time of writing, <:SMLNJ:SML/NJ> still
rejects the code.  An extra space may be used for portability:
`(op * )`. However, parenthesizing `op` is redundant, even though it
is a widely used convention.


== Overloading ==

A number of standard operators (`+`, `-`, `~`, `*`, `<`, `>`, ...) and
numeric constants are overloaded for some of the numeric types (`int`,
`real`, `word`).  It is a common surprise that definitions using
overloaded operators such as

[source,sml]
----
fun min (x, y) = if y < x then y else x
----

are not overloaded themselves.  SML doesn't really support
(user-defined) overloading or other forms of ad hoc polymorphism.  In
cases such as the above where the context doesn't resolve the
overloading, expressions using overloaded operators or constants get
assigned a default type.  The above definition gets the type

[source,sml]
----
val min : int * int -> int
----

See <:Overloading:> and <:TypeIndexedValues:> for further discussion.


== Semicolons ==

It is a common mistake to use redundant semicolons in SML code.  This
is probably caused by the fact that in an SML REPL, a semicolon (and
enter) is used to signal the REPL that it should evaluate the
preceding chunk of code as a unit.  In SML source files, semicolons
are really needed in only two places.  Namely, in expressions of the
form

[source,sml]
----
(exp ; ... ; exp)
----

and

[source,sml]
----
let ... in exp ; ... ; exp end
----

Note that semicolons act as expression (or declaration) separators
rather than as terminators.


== Stale bindings ==

{empty}


== Unresolved records ==

{empty}


== Value restriction ==

See <:ValueRestriction:>.


== Type Variable Scope ==

See <:TypeVariableScope:>.
