Using

Building Projects With make

Using make to build projects with jhc is straightforward. Simply add a rule like the following to your Makefile:

% : %.hs
        jhc -v $< -o $@

If you wish jhc to automatically generate dependency information, you can do the following:

% : %.hs
        jhc --deps depend.make -v $< -o $@

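-include depend.make

To build a haskell library (.hl file) from a library description file, use a rule like the following:
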
%.hl : %.cabal
        jhc -v --build-hl $< -o $@

Options

Usage: jhc [OPTION...] Main.hs
  -V              --version              print version info and exit
                  --version-context      print version context info and exit
                  --help                 print help information and exit
                  --config               show a variety of config info
  -v              --verbose              chatty output on stderr
  -z                                     Increase verbosity of statistics
  -d [no-]flag                           dump specified data during compilation
  -f [no-]flag                           set or clear compilation options
  -o FILE         --output=FILE          output to FILE
  -i DIR          --include=DIR          where to look for source files
  -I DIR                                 add to preprocessor include path
  -D NAME=VALUE                          add new definitions to set in preprocessor
                  --optc=option          extra options to pass to c compiler
                  --progc=gcc            c compiler to use
                  --arg=arg              arguments to pass interpreted program
  -N              --noprelude            no implicit prelude
  -C                                     Typecheck, compile ho and grin.
  -c                                     Typecheck and compile ho.
                  --interpret            interpret.
  -k              --keepgoing            keep going on errors.
                  --width=COLUMNS        width of screen for debugging output.
                  --main=Main.main       main entry point.
  -m arch         --arch=arch            target architecture.
                  --entry=<expr>         main entry point, showable expression.
  -e <statement>                         run given statement as if on jhci prompt
                  --debug                debugging
                  --show-ho=file.ho      Show ho file
                  --noauto               Don't automatically load base and haskell98 packages
  -p file.hl                             Load given haskell library .hl file
  -L path                                Look for haskell libraries in the given directory.
                  --build-hl=file.cabal  Build haskell library from given library description file
                  --interactive          run interactively
                  --ignore-ho            Ignore existing haskell object files
                  --nowrite-ho           Do not write new haskell object files
                  --no-ho                same as --ignore-ho and --nowrite-ho
                  --ho-cache=HOCACHEDIR  Use a global ho cache located at the argument
                  --ho-dir=<dir>         Where to place and look for ho files
                  --dependency           Follow import dependencies only then quit
                  --no-follow-deps       Don't follow dependencies not listed on command line.
                  --list-libraries       List of installed libraries.

valid -d arguments: 'help' for more info
    all-dcons, all-kind, all-types, aspats, bindgroups, boxy-steps, class, class-summary, core
    core-afterlift, core-beforelift, core-initial, core-mangled, core-mini, core-pass, core-steps
    datatable, dcons, decls, defs, derived, e-alias, e-info, e-size, e-verbose, eval, exports, grin
    grin-graph, grin-initial, grin-normalized, grin-pass, grin-posteval, grin-preeval, grin-steps
    html, imports, instance, kind, kind-steps, optimization-stats, parsed, preprocessed, program
    progress, renamed, rules, rules-spec, scc-modules, sigenv, square-stats, srcsigs, stats, steps
    tags, the, types, tyvar, verbose, veryverbose

valid -f arguments: 'help' for more info
    boehm, controlled, cpp, cpr, debug, default, defaulting, ffi, float-in, global-optimize
    inline-pragmas, jgc, lint, m4, monomorphism-restriction, negate, profile, raw, rules, strictness
    type-analysis, unboxed-tuples, unboxed-values, via-ghc, wrapper

Code Options

Various options affecting how jhc interprets and compiles code can be controlled with the '-f' flag. The following options are available; you can negate any particular one by prepending 'no-' to it.

Code options
cpp	pass haskell source through c preprocessor
ffi	support foreign function declarations
m4	pass haskell source through m4 preprocessor
unboxed-tuples	allow unboxed tuple syntax to be recognized
unboxed-values	allow unboxed value syntax

Typechecking
defaulting	perform defaulting of ambiguous types
monomorphism-restriction	enforce monomorphism restriction

Debugging
lint	perform lots of extra type checks

Optimization Options
cpr	do CPR analysis
float-in	perform float inward transform
global-optimize	perform whole program E optimization
inline-pragmas	use inline pragmas
rules	use rules
strictness	perform strictness analysis
type-analysis	perform a basic points-to analysis on types right after method generation

Code Generation
boehm	use Boehm garbage collector
debug	enable debugging code in generated executable
full-int	extend Int and Word to 32 bits on a 32 bit machine (rather than 30)
jgc	use the jgc garbage collector
profile	enable profiling code in generated executable
raw	just evaluate main to WHNF and nothing else.
via-ghc	compile via ghc
wrapper	wrap main in exception handler

Default settings
default	inline-pragmas rules wrapper float-in strictness defaulting type-analysis monomorphism-restriction boxy eval-optimize global-optimize

Dumping Debugging Information

You can have jhc print out a variety of things while running, as controlled by the '-d' flag. The following is a list of possible parameters you can pass to '-d'.

Front End
defs	Show all defined names in a module
derived	show generated derived instances
exports	show which names are exported from each module
imports	show in scope names for each module
parsed	parsed code
preprocessed	code after preprocessing and unliterating
renamed	code after uniqueness renaming
scc-modules	show strongly connected modules in dependency order

Type Checker
all-dcons	show unified data constructor table
all-kind	show unified kind table after everything has been typechecked
all-types	show unified type table, after everything has been typechecked
aspats	show as patterns
bindgroups	show bindgroups
boxy-steps	show step by step what the type inferencer is doing
class	detailed information on each class
class-summary	summary of all classes
dcons	data constructors
decls	processed declarations
instance	show instances
kind	show results of kind inference for each module
kind-steps	show steps of kind inference
program	impl expls, the whole shebang.
sigenv	initial signature environment
srcsigs	processed signatures from source code
types	display unified type table containing all defined names
tyvar	show original tyvars rather than renaming them.

Intermediate code
core	show intermediate core code
core-afterlift	show final core before writing ho file
core-beforelift	show core before lambda lifting
core-initial	show core right after E.FromHs conversion
core-mangled	de-typed core right before it is converted to grin
core-mini	show details even when optimizing individual functions
core-pass	show each iteration of code while transforming
core-steps	show what happens in each pass
datatable	show data table of constructors
e-alias	show expanded aliases
e-info	show info tags on all bound variables
e-size	print the size of E after each pass
e-verbose	print very verbose version of E code always
optimization-stats	show combined stats of optimization passes
rules	show all user rules and catalysts
rules-spec	show specialization rules

Grin code
eval	show detailed eval inlining info
grin	show final grin code
grin-graph	print dot file of final grin code to outputname_grin.dot
grin-initial	grin right after conversion from core
grin-normalized	grin right after first normalization
grin-pass	show each iteration of code while transforming
grin-posteval	show grin code just after eval/apply inlining
grin-preeval	show grin code just before eval/apply inlining
grin-steps	show what happens in each transformation
steps	show interpreter go
tags	list of all tags and their types

General
html	use html escape codes in output
progress	show basic progress indicators
square-stats	use square corners rather than curved ones, for compatibility
stats	show extra information about stuff
verbose	progress
veryverbose	progress stats

Pragmas

Function Properties

These must appear in the same file as the definition of a function. To apply one to an instance or class method, you must place it in the where clause of the instance or class declaration.

NOINLINE : Do not inline the given function during core transformations. The function may be inlined during grin transformations.

INLINE : Inline this function whenever possible.

SUPERINLINE : Always inline no matter what, even if it means making a local copy of the function's body.

NOETA : When applied to a class method, do not perform eta expansion up to the number of arguments specified by the type.
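
As a minimal sketch of where these pragmas go, the module below attaches INLINE and NOINLINE to top-level functions and places an INLINE pragma inside the where clause of an instance. The names Shape, HasArea, double and total are made up for illustration and are not part of jhc. The pragmas are written in the usual `{-# ... #-}` comment form.

```haskell
module Main (main) where

-- Hypothetical example type and class, used only for illustration.
data Shape = Circle Double | Square Double

class HasArea a where
    area :: a -> Double

instance HasArea Shape where
    -- A pragma for a method goes in the where clause of the instance.
    {-# INLINE area #-}
    area (Circle r) = pi * r * r
    area (Square s) = s * s

-- Small helper: ask for it to be inlined whenever possible.
{-# INLINE double #-}
double :: Double -> Double
double x = x + x

-- Keep this one out of line during core transformations.
{-# NOINLINE total #-}
total :: [Shape] -> Double
total = sum . map area

main :: IO ()
main = print (double (total [Square 2, Square 3]))
```

The pragmas only guide the optimizer; the program computes the same result with or without them.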

Rules/Specializations

RULES : rewrite rules. These have the same syntax and behave like GHC's rewrite rules, except 'phase' information is not allowed.

SPECIALIZE : create a version of a function that is specialized for a given type.

SUPERSPECIALIZE : has the same effect as SPECIALIZE, but also places a run-time check in the generic version of the function to determine whether to call the specialized version.
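
The following sketch shows a SPECIALIZE pragma and a rewrite rule written in GHC's syntax without any phase annotation, which is the form described above. The function sumSquares and the rule name "map/map" are illustrative choices, not something jhc provides. Whether a given rule fires is up to the compiler, so the example relies only on the rule being semantics-preserving.

```haskell
module Main (main) where

-- Hypothetical function used only to illustrate the pragmas.
sumSquares :: Num a => [a] -> a
sumSquares = sum . map (\x -> x * x)

-- Ask for a copy of sumSquares specialized to lists of Int.
{-# SPECIALIZE sumSquares :: [Int] -> Int #-}

-- A rewrite rule in GHC syntax, with no phase information.
-- Both sides denote the same list, so firing it never changes results.
{-# RULES
"map/map" forall f g xs. map f (map g xs) = map (f . g) xs
  #-}

main :: IO ()
main = print (sumSquares (map (+ 1) (map (* 2) [1, 2, 3 :: Int])))
```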

Header Pragmas

These pragmas are only valid in the 'head' of a file: they must come before the initial 'module' declaration and within the first 4096 bytes of the file, and they must contain and be preceded by only characters in the ASCII character set.

NOPRELUDE : Do not load the 'Prelude' automatically. Equivalent to passing --noprelude on the command line.

OPTIONS_JHC : Specify extra options to use when processing this file. The options available are equivalent to the command line options, though not all may have meaning when applied to a single file.

LANGUAGE : Specify various language options.
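
A sketch of a file head that uses OPTIONS_JHC is shown below. It assumes the pragma body is written exactly as the flag would be on the command line (here -fffi, to enable foreign function declarations); the imported C function and the name c_sin are illustrative only.

```haskell
{-# OPTIONS_JHC -fffi #-}
-- The pragma above sits before the 'module' line, as header pragmas must.
module Main (main) where

-- Foreign declaration that the ffi option is meant to allow.
-- c_sin is a hypothetical name for the C library's sin function.
foreign import ccall "math.h sin" c_sin :: Double -> Double

main :: IO ()
main = print (c_sin 0)
```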

Extensions

Top Level Actions

Jhc supports monadic actions declared at the top level of your module. These can be used to do things such as initialize IORefs or allocate static data. An example of a top level action is the following.

import Jhc.ACIO
import Data.IORef

ref <- newIORefAC 0

count = do
    modifyIORef ref (1 +)
    readIORef ref >>= print

main = do
    count
    count
    count

This will print 1, 2, and 3. A special monad ACIO (which stands for Affine Central IO) is provided to restrict what may take place in top level actions. Basically, top level actions can only consist of IO that can be omitted or reordered without changing the meaning of a program. In practice, this means that it does not matter whether such actions are all performed at the beginning or are only computed once on demand.

If you need to use arbitrary IO, a utility function 'runOnce' is provided. Using it, you can ensure arbitrary IO actions are run only once and the return values shared. However, you must access the value inside the IO monad, thus ensuring program integrity. An example using a hypothetical GUI library is below.

import Jhc.ACIO

getWindow <- runOnce $ do
    connection <- newGUIConnection
    window <- createWindow (640,480)
    setTitle window "My Global Window"
    return window

main = do
    w <- getWindow
    draw w "Hello!"
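
To make the run-once behaviour concrete, here is a sketch of the same idea using only standard libraries. runOnceSketch is a hypothetical stand-in and not jhc's implementation: it does not use Jhc.ACIO and is created inside main rather than at the top level. It caches the first result in an MVar so the wrapped action runs at most once.

```haskell
module Main (main) where

import Control.Concurrent.MVar (modifyMVar, newMVar)
import Data.IORef (modifyIORef, newIORef, readIORef)

-- Hypothetical stand-in for the behaviour described above.
-- Returns an accessor that runs the action on first use and
-- hands back the cached result on every later use.
runOnceSketch :: IO a -> IO (IO a)
runOnceSketch action = do
    cache <- newMVar Nothing
    return $ modifyMVar cache $ \cached -> case cached of
        Just value -> return (cached, value)
        Nothing    -> do
            value <- action
            return (Just value, value)

main :: IO ()
main = do
    runs <- newIORef (0 :: Int)
    getValue <- runOnceSketch (modifyIORef runs (+ 1) >> return "window")
    a <- getValue
    b <- getValue
    n <- readIORef runs
    -- The action ran once even though the accessor was used twice.
    print (a, b, n)
```

As in the jhc version, the shared value can only be reached through an IO action.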

Note that top level global variables can be indicative of design issues. In general, they should only be used when necessary to interface with an external library, for opaque uses inside a library where the shared state cannot be externally observed, or inside your Main program as design dictates.

Unboxed Values

Unboxed values in jhc are specified in a similar fashion to GHC; however, the lexical syntax is not changed to allow # in identifiers. # is still used in the syntax for various unboxed constructs, but normal Haskell rules apply to other Haskell values. The convention is to suffix such types with '_' to indicate their status as unboxed.

Unboxed Tuples

Jhc supports unboxed tuples with the same syntax as GHC: (# 2, 4 #) is an unboxed tuple of two numbers. Unboxed tuples are enabled with -funboxed-tuples.
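
The following sketch returns two results in an unboxed tuple and takes them apart with a case expression. The function quotRemU is an illustrative name. The LANGUAGE line is what GHC needs for the same syntax, and putting -funboxed-tuples in an OPTIONS_JHC pragma instead of on the command line is an assumption of this sketch.

```haskell
{-# LANGUAGE UnboxedTuples #-}
{-# OPTIONS_JHC -funboxed-tuples #-}
module Main (main) where

-- Hypothetical helper: return quotient and remainder without
-- allocating a boxed pair for the result.
quotRemU :: Int -> Int -> (# Int, Int #)
quotRemU n d = (# n `quot` d, n `rem` d #)

main :: IO ()
main = case quotRemU 17 5 of
    -- An unboxed tuple is consumed by pattern matching on it.
    (# q, r #) -> print (q, r)
```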

Unboxed Strings

Unboxed strings are enabled with the -funboxed-values flag. They are specified like a normal string but have a '#' at the end. Unboxed strings have type 'Addr_', which is a synonym for 'BitsPtr'.

Unboxed Numbers

Unboxed numbers are enabled with the -funboxed-values flag. They are suffixed with a '#', as in 3# or 4#. Jhc supports a limited form of type inference for unboxed numbers: if the type is fully specified by the environment and it is a suitable unboxed numeric type, then that type is used. Otherwise it defaults to Int__.

Differences

Differences from Haskell 98

Language Differences

Class contexts on data types are silently ignored.
Class methods are fully 'eta expanded' out to the argument count specified by the type. This is often beneficial as instances that need to share partial applications are rare. This behavior can be turned off with the NOETA pragma for specific methods.

Library Changes

In addition to a larger set of base libraries roughly modeled on GHC's base, jhc provides a number of extensions and minor modifications to the standard libraries. These are designed to be mostly backwards compatible, and most are to the class system.

Data.Bits
- Num is no longer a superclass of the Bits class. It never should have been.
- There are new methods logicalShiftR and arithmeticShiftR that do a logical and an arithmetic shift respectively. shiftR will always map to one of those as appropriate.
- shiftR and shiftL do not check for negative arguments. If you might want negative arguments, use the general 'shift' routine. 'shift' also comes in logical and arithmetic varieties.
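
The difference between the two kinds of right shift can be seen with the standard Data.Bits API. The helpers below are hypothetical, named after the jhc methods but fixed to Int8 and written without them: shiftR on a signed type copies the sign bit, while routing the value through Word8 shifts in zeros.

```haskell
module Main (main) where

import Data.Bits (shiftR)
import Data.Int (Int8)
import Data.Word (Word8)

-- Hypothetical helper: arithmetic shift, sign bit is replicated.
arithmeticShiftR8 :: Int8 -> Int -> Int8
arithmeticShiftR8 = shiftR

-- Hypothetical helper: logical shift, zeros are shifted in from the left.
-- The value is reinterpreted as unsigned, shifted, then converted back.
logicalShiftR8 :: Int8 -> Int -> Int8
logicalShiftR8 x n = fromIntegral (fromIntegral x `shiftR` n :: Word8)

main :: IO ()
main = do
    print (arithmeticShiftR8 (-8) 1)  -- 0xF8 becomes 0xFC, i.e. -4
    print (logicalShiftR8 (-8) 1)     -- 0xF8 becomes 0x7C, i.e. 124
```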

Library Additions

There are many other additional libraries provided with jhc. Here I list only changes that affect modules that are defined by the Haskell 98 or FFI specifications.

Data.Int and Data.Word provide WordPtr, WordMax, IntPtr, and IntMax that correspond to the C types uintptr_t, uintmax_t, intptr_t, and intmax_t respectively.
fromInt, toInt, fromDouble, and toDouble have been added alongside the Integer and Rational routines in their respective classes.
Floating point truncation and rounding functions have varieties that don't return an integral type, but rather return something of the same type as their argument. These have the same names but end in 'f'.

Notable Differences from GHC

Jhc differs from GHC in certain ways that are allowed by Haskell 98, but might come as a surprise to some.

An Int may be only 30 bits and may not observe simple binary truncation on overflow. If you need known bit width and binary semantics for your numbers then use the types in Data.Int and Data.Word. Overflow on Int or Word has undefined results.
A Char may only preserve values within the Unicode range. Storing values greater than 0x10FFFF has undefined results.
The Int and Word types are at most 32 bits, even on 64 bit architectures.
All text based IO is performed according to the current locale. This means that Unicode works seamlessly, but older programs that assumed IO was performed by simple truncation of chars down to 8 bits will fail. Use the explicit binary routines if you need binary IO.
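
As a short illustration of the advice about Data.Int and Data.Word, the sketch below uses sized types where wrap-around matters. The helper nextInt32 is an illustrative name. The sized types are specified to do arithmetic modulo 2^n, so the results do not depend on how wide Int happens to be.

```haskell
module Main (main) where

import Data.Int (Int32)
import Data.Word (Word8)

-- Hypothetical helper: increment with well-defined wrap-around,
-- which plain Int does not promise under jhc.
nextInt32 :: Int32 -> Int32
nextInt32 x = x + 1

main :: IO ()
main = do
    -- Overflow wraps from the largest to the smallest value.
    print (nextInt32 maxBound == minBound)
    -- Narrowing conversion keeps the low 8 bits: 300 mod 256 is 44.
    print (fromIntegral (300 :: Int) :: Word8)
```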

Differences That are Considered Misfeatures

These misfeatures will be fixed at some point.

Integer corresponds to IntMax rather than an arbitrary precision type. As soon as a suitable arbitrary precision library emerges, it will be replaced.
Ix is not derivable.

Internals

The Run Time System

Jhc is very minimalist in that it does not have a precompiled run time system, but rather generates what is needed as part of the compilation process. However, we call whatever conventions and binary layouts used in the generated executable the run time system. Since jhc generates the code anew each time, it can build a different 'run time' based on compiler options, trading things like the garbage collector as needed or changing the closure layout when we know we have done full program optimization. This describes the 'native' layout upon which other conventions are layered.

A basic value in jhc is represented by a 'smart pointer' of C type sptr_t. A smart pointer is the size of a native pointer, but can take on different roles depending on a pair of tag bits.

Smart pointers take on a general form as follows:

-------------------------
|    payload        | GL|
-------------------------

  G - if set, then the garbage collector should not treat value as a pointer to be followed
  L - lazy, this bit being set means the value is not in WHNF

A raw sptr_t on its own in the wild can only take on one of the following values:

-------------------------
|    raw value      | 10|
-------------------------

-------------------------
|    whnf location  | 00|
-------------------------

-------------------------
|   lazy location   | 01|
-------------------------

A raw value can be anything and is not necessarily a pointer. A WHNF location is a pointer to some value in WHNF. The system places no restrictions on what is actually pointed to by a WHNF pointer, though the garbage collector in use may. In general, the back end is free to choose what to place in the raw value field and what a WHNF location points to. If an implementation sees that the L bit is clear, it can pass on the smart pointer without examining it, knowing the value is in WHNF.

A lazy location points to a potential closure or an indirection to a WHNF value. The lazy location is an allocated chunk of memory that is at least one pointer long. The very first location in a closure must be one of the following.
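
The tag-bit scheme above can be modelled in a few lines. This is a sketch only: SPtr, Role, classify, isWhnf and mkRaw are hypothetical names, an sptr_t is modelled as a plain machine word, and storing the payload shifted left by two bits is an assumption made for the example. It takes L to be the lowest bit and G the bit above it, as in the diagrams.

```haskell
module Main (main) where

import Data.Bits (shiftL, testBit, (.&.), (.|.))
import Data.Word (Word)

-- Hypothetical model of an sptr_t as a machine word; not jhc's code.
newtype SPtr = SPtr Word

data Role = WhnfLocation | LazyLocation | RawValue
    deriving (Eq, Show)

-- Decode the two tag bits of a free-standing smart pointer.
classify :: SPtr -> Maybe Role
classify (SPtr w) = case w .&. 3 of
    0 -> Just WhnfLocation   -- tag 00
    1 -> Just LazyLocation   -- tag 01
    2 -> Just RawValue       -- tag 10
    _ -> Nothing             -- tag 11 is not valid for a raw sptr_t

-- With L clear the value is in WHNF; the payload need not be examined.
isWhnf :: SPtr -> Bool
isWhnf (SPtr w) = not (testBit w 0)

-- Pack a small number as a raw value (payload layout is assumed).
mkRaw :: Word -> SPtr
mkRaw n = SPtr ((n `shiftL` 2) .|. 2)

main :: IO ()
main = do
    print (classify (mkRaw 42))
    print (classify (SPtr 0x1001))
    print (isWhnf (mkRaw 42))
```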

-------------------------
| raw value or whnf  |X0|
-------------------------

An evaluated value, interpreted exactly as above. One can always replace any occurrence of a lazy location with an evaluated indirection.

-------------------------
|    code pointer   | 11|
-------------------------
|     data ...          |

This is something to evaluate. The code pointer is a pointer to a function that takes the memory location as its only argument. The called function is in charge of updating the location if needed.

Note that it is invalid to have a lazy location point to another lazy location. There is only ever one level of indirection allowed, and only from lazy locations.

Note that a partial application is just like any other value in WHNF as far as the above is concerned. It happens to possibly contain a code pointer.
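
The evaluation protocol for lazy locations can be sketched with mutable references standing in for memory. Everything here is a model rather than jhc's generated code: Header, eval and mkThunk are hypothetical names, an IORef plays the role of the lazy location, and Int stands in for a raw value or WHNF pointer. The code stored in a location receives that location as its only argument and overwrites it with the evaluated result.

```haskell
module Main (main) where

import Data.IORef (IORef, modifyIORef, newIORef, readIORef, writeIORef)

-- Stand-in for a raw value or WHNF pointer.
type Value = Int

-- First word of a lazy location: an evaluated indirection (tag X0)
-- or a code pointer (tag 11) that is handed the location itself.
data Header
    = Evaluated Value
    | Code (IORef Header -> IO Value)

-- Follow at most one level of indirection, running code if needed.
eval :: IORef Header -> IO Value
eval loc = do
    header <- readIORef loc
    case header of
        Evaluated v -> return v
        Code run    -> run loc

-- Build a thunk whose code updates its own location after running.
mkThunk :: IO Value -> IO (IORef Header)
mkThunk compute = newIORef (Code run)
  where
    run loc = do
        v <- compute
        writeIORef loc (Evaluated v)
        return v

main :: IO ()
main = do
    counter <- newIORef (0 :: Int)
    loc <- mkThunk (modifyIORef counter (+ 1) >> return 42)
    a <- eval loc
    b <- eval loc
    n <- readIORef counter
    -- The computation ran once; the second eval saw the indirection.
    print (a, b, n)
```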

Jhc core normalized forms

Jhc core has a number of 'normalized forms' in which certain invariants are met. Many routines expect code to be in a certain form and guarantee their output is also in a given form. The type system can also change with each form by adding or removing terms from the PTS axioms and rules.

normalized form alpha : There are basically no restrictions other than that the code is typesafe, but certain constructs that are checked by the type checker are okay when they wouldn't otherwise be. In particular, 'newtype' casts still exist at the data level. 'enum' scrutinizations and creations may be in terms of the virtual constructors rather than the internal representations. 'let' may bind unboxed values, which is normally not allowed.

normalized form beta : This is like alpha except all data type constructors and case scrutinizations are in their final form: newtype coercions are removed, enums are desugared, and so on. Also, 'let' bindings of unboxed values are translated to the appropriate 'case' statements. The output of E.FromHs is in this form.

normalized form blue : This is the form that most routines work on.

normalized form larry : The form after lambda lifting.

normalized form mangled : All polymorphism has been replaced with subtyping.

Jhc Core Type System

Jhc's core is based on a pure type system. A pure type system (also called a PTS) is actually a parameterized set of type systems. Jhc's version is described by the following.

Sorts  = (*,!,**,#,(#),##)
Axioms = (*::**,#::##,(#)::##,!::**)


*   is the sort of boxed values
!   is the sort of boxed strict values
**  is the supersort of all boxed values
#   is the sort of unboxed values
(#) is the sort of unboxed tuples
##  is the supersort of all unboxed values

In addition, there exist user defined kinds, which are always of supersort ##.

The following Rules table shows what sorts of abstractions are allowed. A rule of the form (A,B,C) means you can have functions from things of sort A to things of sort B, and the result is something of sort C. Function in this context subsumes both term and type level abstractions. Notice that functions are always boxed, but may be strict if they take an unboxed tuple as an argument. (TODO: explain strict in this context) These type system rules apply to lambda abstractions. It is possible to inherit values from the environment that would not be typable via lambda abstractions. For instance, although a data constructor may have a functional type, it was not created via a lambda abstraction, so these rules do not apply.

As a shortcut we will use *# to mean either * or # and so forth,
so (*#,*#,*) means (*,*,*) (#,*,*) (*,#,*) (#,#,*)

Rules =
   (*#!,*#!,*)  -- functions from values to values are boxed and lazy
   (*#!,(#),*)  -- functions from values to unboxed tuples are boxed and lazy
   ((#),*#!,!)  -- functions from unboxed tuples to values are boxed and strict
   ((#),(#),!)  -- functions from unboxed tuples to unboxed tuples are boxed and strict
   (**,*,*)     -- may have a function from an unboxed type to a value
   (**,#,*)
   (**,!,*)
   (**,**,**)  -- we have functions from types to types
   (**,##,##)  -- Array__ a :: #
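
The axioms and rules above are small enough to write down as two lookup functions. The encoding below is a sketch for reading the tables, not code from jhc: the constructor names and the functions axiom, isValue and rule are invented here. It expands the *#! shorthand into a membership test and returns Nothing for combinations with no rule.

```haskell
module Main (main) where

-- The six sorts: *, !, **, #, (#), ##
data Sort = Star | Bang | StarStar | Hash | HashTuple | HashHash
    deriving (Eq, Show)

-- Axioms: the supersort of each sort that has one.
axiom :: Sort -> Maybe Sort
axiom Star      = Just StarStar
axiom Bang      = Just StarStar
axiom Hash      = Just HashHash
axiom HashTuple = Just HashHash
axiom _         = Nothing

-- The shorthand *#! stands for any one of *, # or !.
isValue :: Sort -> Bool
isValue s = s `elem` [Star, Hash, Bang]

-- rule a b is the sort of a function from sort a to sort b, if allowed.
rule :: Sort -> Sort -> Maybe Sort
rule a b
    | isValue a && isValue b           = Just Star
    | isValue a && b == HashTuple      = Just Star
    | a == HashTuple && isValue b      = Just Bang
    | a == HashTuple && b == HashTuple = Just Bang
    | a == StarStar && isValue b       = Just Star
    | a == StarStar && b == StarStar   = Just StarStar
    | a == StarStar && b == HashHash   = Just HashHash
    | otherwise                        = Nothing

main :: IO ()
main = do
    print (rule Hash Star)        -- a (#,*,*) instance of (*#!,*#!,*)
    print (rule HashTuple Hash)   -- strict: argument is an unboxed tuple
    print (rule Hash StarStar)    -- no rule with this shape
```

Because each pair of argument and result sorts maps to at most one function sort, rule can be an ordinary function.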

The defining feature of boxed values is

_|_ :: t iff t::*

This PTS is functional but not injective.