Hacked up the Prolog compiler code.
This commit is contained in:
parent
0801c9216a
commit
41fe48ec83
46
README
46
README
|
|
@ -1,10 +1,10 @@
|
|||
|
||||
|
||||
Thun
|
||||
Thun
|
||||
|
||||
A dialect of Joy in Python.
|
||||
Dialects of Joy in Python and Prolog.
|
||||
|
||||
v0.2.0
|
||||
v0.3.0
|
||||
|
||||
|
||||
--------------------------------------------------
|
||||
|
|
@ -33,37 +33,10 @@ Thun. If not see <http://www.gnu.org/licenses/>.
|
|||
§.1 Introduction
|
||||
|
||||
Joy is a programming language created by Manfred von Thun that is easy to
|
||||
use and understand and has many other nice properties. This Python
|
||||
package implements an interpreter for a dialect of Joy that attempts to
|
||||
use and understand and has many other nice properties. This project
|
||||
implements Python and Prolog interpreters for dialects that attempts to
|
||||
stay very close to the spirit of Joy but does not precisely match the
|
||||
behaviour of the original version(s) written in C. The main difference
|
||||
between Thun and the originals, other than being written in Python, is
|
||||
that it works by the "Continuation-Passing Style".
|
||||
|
||||
As I study Joy I find that it is very aptly named. It is clear, concise,
|
||||
and ameniable to advanced techniques for constructing bug-free software.
|
||||
|
||||
Developed by Manfred von Thun, don't know much about him, not much on the
|
||||
web about Joy and von Thun (Von Thun?) See references below.
|
||||
|
||||
Because it has desirable properties (concise, highly factored) the
|
||||
programming process changes, the ratio of designing to writing code
|
||||
shifts in favor of design. The documentation becomes extensive while the
|
||||
code shrinks to stable bodies of small well-factored incantations that
|
||||
are highly expressive, much like mathematical papers consist of large
|
||||
bodies of exposition interlaced with mathematical formula that concisely
|
||||
and precisely express the meaning of the text.
|
||||
|
||||
The time and attention of the programmer shifts from thinking about the
|
||||
language to thinking in the language, and the development process feels
|
||||
more like deriving mathematical truths than like writing ad-hoc
|
||||
solutions.
|
||||
|
||||
I hope that this package is useful in the sense that it provides an
|
||||
additional joy interpreter (the binary in the archive from La Trobe seems
|
||||
to run just fine on my modern Linux machine!) But I also hope that you
|
||||
can read and understand the Python code and play with the implementation
|
||||
itself.
|
||||
behaviour of the original version written in C.
|
||||
|
||||
The best source (no pun intended) for learning about Joy is the
|
||||
information made available at the website of La Trobe University (see the
|
||||
|
|
@ -180,14 +153,13 @@ TODO:
|
|||
§.4.6 Refactoring
|
||||
|
||||
|
||||
|
||||
§.5 This Implementation
|
||||
|
||||
Run with:
|
||||
|
||||
python -m joy
|
||||
|
||||
joypy
|
||||
Thun
|
||||
|-- COPYING - license
|
||||
|-- README - this file
|
||||
|
|
||||
|
|
@ -208,6 +180,10 @@ joypy
|
|||
| |-- pretty_print.py - convert Joy datastructures to text
|
||||
| `-- stack.py - work with stacks
|
||||
|
|
||||
|-- thun - Experimental Prolog Code
|
||||
| |-- compiler.pl - A start on a compiler for Prof. Wirth's RISC CPU
|
||||
| `-- thun.pl - An interpreter in the Logical Paradigm, compiler.
|
||||
|
|
||||
`-- setup.py
|
||||
|
||||
|
||||
|
|
|
|||
265
thun/compiler.pl
265
thun/compiler.pl
|
|
@ -1,6 +1,6 @@
|
|||
/*
|
||||
/*
|
||||
|
||||
Copyright © 2018-2019 Simon Forman
|
||||
Copyright © 2018 Simon Forman
|
||||
|
||||
This file is part of Thun
|
||||
|
||||
|
|
@ -17,27 +17,6 @@ GNU General Public License for more details.
|
|||
You should have received a copy of the GNU General Public License
|
||||
along with Thun. If not see <http://www.gnu.org/licenses/>.
|
||||
|
||||
The Joy interpreter that this implements is pretty crude. the only types
|
||||
are 16-bit integers and linked lists. The lists are 32-bit words divided
|
||||
into two 16-bit fields. The high half is the node value and the low half
|
||||
points directly (not offset) to the next cell, zero terminates the list.
|
||||
|
||||
The expression is expected to be already written in RAM as a linked list at
|
||||
the time the mainloop starts. As yet there is no support for actually doing
|
||||
this. Both the new stack and expression cells are written to the same heap
|
||||
intermixed. The stack and expression pointers never decrease, the whole
|
||||
history of the computation is recorded in RAM. If the computation of the
|
||||
expression overruns the end of RAM (or 16-bits whichever comes first) the
|
||||
machine crashes.
|
||||
|
||||
At the moment, functions are recognized by setting high bit, but I don't
|
||||
think I remembered to set the bits during compilation, so it won't work
|
||||
at all right now. Er... Boo. Anyhow, the whole thing is very crude and
|
||||
not at all what I am hoping eventually to build.
|
||||
|
||||
But it's a start, and I feel good about emitting machine code (even if the
|
||||
program doesn't do anything useful yet.)
|
||||
|
||||
*/
|
||||
:- use_module(library(assoc)).
|
||||
:- use_module(library(clpfd)).
|
||||
|
|
@ -57,87 +36,23 @@ do :- Program = [
|
|||
compile_program(Program, Binary),
|
||||
write_binary('joy_asm.bin', Binary).
|
||||
|
||||
% phrase(pass0(Program, AST), [], _),
|
||||
% write_canonical(AST),
|
||||
% phrase(⟐(AST), IR),
|
||||
% write_canonical(IR),
|
||||
% phrase(linker(IR), ASM),
|
||||
% write_canonical(ASM).
|
||||
|
||||
compile_program(Program, Binary) :-
|
||||
phrase((init, ⦾(Program, IR)), [], [Context]),
|
||||
phrase(⟐(IR), ASM),
|
||||
phrase(linker(ASM), EnumeratedASM),
|
||||
foo(Context),
|
||||
phrase(asm(EnumeratedASM), Binary).
|
||||
pass0(Code, Program) --> init, ⦾(Code, Program).
|
||||
|
||||
foo(Context) :-
|
||||
get_assoc(dictionary, Context, D),
|
||||
assoc_to_list(D, Dictionary),
|
||||
portray_clause(Dictionary).
|
||||
|
||||
|
||||
/*
|
||||
|
||||
This first stage ⦾//2 converts the Joy description into a kind of intermediate
|
||||
representation that models the Joy interpreter on top of the machine but doesn't
|
||||
actually use assembly instructions. It also manages the named registers and
|
||||
memory locations so thet don't appear in the program.
|
||||
|
||||
The idea here is to extract the low-level "primitives" needed to define the Joy
|
||||
interpreter to make it easier to think about (and maybe eventually retarget other
|
||||
CPUs.)
|
||||
|
||||
*/
|
||||
init, [Context] -->
|
||||
{empty_assoc(C), empty_assoc(Dictionary),
|
||||
put_assoc(dictionary, C, Dictionary, Context)}.
|
||||
|
||||
⦾([], []) --> [].
|
||||
|
||||
⦾([ヲ|Terms], Ts) --> % Preamble.
|
||||
% Initialize context/state/symbol table.
|
||||
set(dict_ptr, 11), % Reg 11 is a pointer used during func lookup.
|
||||
set(dict_top, 12), % Reg 12 points to top of dictionary.
|
||||
set(dict, 0), % Address of top of dict during compilation.
|
||||
set(done, _DONE), % DONE label (logic variable.)
|
||||
set(expr, 4), % Reg 4 points to expression.
|
||||
set(halt, _HALT), % HALT label (logic variable.)
|
||||
set(main, _MAIN), % MAIN label (logic variable.)
|
||||
set(reset, _Reset), % Reset label (logic variable.)
|
||||
set(sp, 2), % Reg 2 points to just under top of stack.
|
||||
set(temp0, 6), % Reg 6 is a temp var.
|
||||
set(temp1, 7), % Reg 7 is a temp var.
|
||||
set(temp2, 8), % Reg 8 is a temp var.
|
||||
set(temp3, 9), % Reg 9 is a temp var.
|
||||
set(term, 5), % Reg 4 holds current term.
|
||||
set(tos, 3), % Reg 3 holds Top of Stack.
|
||||
⦾(Terms, Ts).
|
||||
|
||||
⦾([ヵ|Terms], [ % Initialization.
|
||||
jump(Over), % Oberon bootloader writes MemLim to RAM[12] and
|
||||
asm(allocate(_, 16)), % stackOrg to RAM[24], we don't need these
|
||||
label(Over), % but they must not be allowed to corrupt our code.
|
||||
set_reg_const(0, 0), % zero out the root cell.
|
||||
write_ram(0, 0),
|
||||
set_reg_const(SP, 0x1000),
|
||||
set_reg_const(EXPR, 0x500),
|
||||
set_reg_label(DICT_TOP, LastWord),
|
||||
set_reg_const(TOS, 0),
|
||||
set_reg_const(TERM, 0),
|
||||
asm(store_word(TOS, SP, 0)) % RAM[SP] := 0
|
||||
|Ts]) -->
|
||||
get([dict_top, DICT_TOP, expr, EXPR, sp, SP, term, TERM, tos, TOS]),
|
||||
⦾(Terms, Ts), get(dict, LastWord).
|
||||
|
||||
⦾([メ|Terms], [ % Mainloop.
|
||||
label(MAIN),
|
||||
if_zero(EXPR, HALT),
|
||||
deref(EXPR),
|
||||
split_word(TERM, EXPR),
|
||||
if_literal(TERM, PUSH),
|
||||
lookup(DICT_PTR, DICT_TOP, TERM, HALT), % Jump to command or if not found halt.
|
||||
label(PUSH), push(TOS, TERM, SP), % stack = TERM, stack
|
||||
label(DONE), write_ram(SP, TOS), % RAM[SP] := TOS
|
||||
jump(MAIN)
|
||||
|Ts]) -->
|
||||
get([dict_ptr, DICT_PTR, dict_top, DICT_TOP, done, DONE, expr, EXPR,
|
||||
halt, HALT, main, MAIN, sp, SP, term, TERM, tos, TOS]),
|
||||
⦾(Terms, Ts).
|
||||
|
||||
⦾([Body, ≡(NameAtom)|Terms], [defi(Name, B, Prev, I, SP, TOS)|Ts]) -->
|
||||
get(dict, Prev), set(dict, Name), get([sp, SP, tos, TOS]),
|
||||
get(dict, Prev), set(dict, Name), get(sp, SP), get(tos, TOS),
|
||||
inscribe(NameAtom, Name), ⦾(Terms, Ts), lookup(i, I), lookup(Body, B).
|
||||
|
||||
⦾([Body, ヮ(NameAtom)|Terms], [definition(Name, DONE, B, Prev)|Ts]) -->
|
||||
|
|
@ -154,6 +69,48 @@ CPUs.)
|
|||
⦾([P, B, ヱ|Terms], [repeat_until(Predicate, Body)|Ts]) -->
|
||||
⦾([P, B, Terms], [Predicate, Body, Ts]).
|
||||
|
||||
⦾([ヲ|Terms], Ts) --> % Preamble.
|
||||
set(dict, 0), set(done, _DONE),
|
||||
set(temp0, 6), set(temp1, 7),
|
||||
set(temp2, 8), set(temp3, 9),
|
||||
set(tos, 3), set(sp, 2), set(expr, 4), set(term, 5),
|
||||
set(dict_top, 12), set(dict_ptr, 11),
|
||||
set(halt, _HALT), set(main, _MAIN), set(reset, _Reset),
|
||||
⦾(Terms, Ts).
|
||||
|
||||
⦾([ヵ|Terms], [ % Initialization.
|
||||
jump(Over), % Oberon bootloader writes MemLim to RAM[12] and
|
||||
asm(allocate(_, 16)), % stackOrg to RAM[24], we don't need these
|
||||
label(Over), % but they must not be allowed to corrupt our code.
|
||||
set_reg_const(0, 0), % zero out the root cell.
|
||||
write_ram(0, 0),
|
||||
set_reg_const(SP, 0x1000),
|
||||
set_reg_const(EXPR, 0x500),
|
||||
set_reg_label(DICT_TOP, LastWord),
|
||||
set_reg_const(TOS, 0),
|
||||
set_reg_const(TERM, 0),
|
||||
asm(store_word(TOS, SP, 0)) % RAM[SP] := 0
|
||||
|Ts]) -->
|
||||
get(dict_top, DICT_TOP), get(expr, EXPR),
|
||||
get(sp, SP), get(term, TERM), get(tos, TOS),
|
||||
⦾(Terms, Ts), get(dict, LastWord).
|
||||
|
||||
⦾([メ|Terms], [ % Mainloop.
|
||||
label(MAIN),
|
||||
if_zero(EXPR, HALT),
|
||||
deref(EXPR),
|
||||
split_word(TERM, EXPR),
|
||||
if_literal(TERM, PUSH),
|
||||
lookup(DICT_PTR, DICT_TOP, TERM, HALT), % Jump to command or if not found halt.
|
||||
label(PUSH), push(TOS, TERM, SP), % stack = TERM, stack
|
||||
label(DONE), write_ram(SP, TOS), % RAM[SP] := TOS
|
||||
jump(MAIN)
|
||||
|Ts]) -->
|
||||
get(done, DONE), get(main, MAIN), get(halt, HALT),
|
||||
get(dict_ptr, DICT_PTR), get(dict_top, DICT_TOP), get(expr, EXPR),
|
||||
get(sp, SP), get(term, TERM), get(tos, TOS),
|
||||
⦾(Terms, Ts).
|
||||
|
||||
⦾([Term|Terms], [T|Ts]) --> ⦾(Term, T), ⦾(Terms, Ts).
|
||||
|
||||
⦾(∅, dw(0)) --> [].
|
||||
|
|
@ -185,23 +142,6 @@ CPUs.)
|
|||
⦾(ナ, low_half(TEMP0, TOS)) --> get(temp0, TEMP0), get(tos, TOS).
|
||||
⦾(ヶ, low_half(TOS, SP)) --> get(sp, SP), get(tos, TOS).
|
||||
|
||||
|
||||
/*
|
||||
|
||||
Context (state) manipulation for the ⦾//2 relation.
|
||||
|
||||
Association lists are used to keep a kind of symbol table as well as a dictionary
|
||||
of name atoms to address logic variables for defined Joy functions.
|
||||
|
||||
*/
|
||||
|
||||
init, [Context] -->
|
||||
{empty_assoc(C), empty_assoc(Dictionary),
|
||||
put_assoc(dictionary, C, Dictionary, Context)}.
|
||||
|
||||
get([]) --> !.
|
||||
get([Key, Value|Ts]) --> !, get(Key, Value), get(Ts).
|
||||
|
||||
get(Key, Value) --> state(Context), {get_assoc(Key, Context, Value)}.
|
||||
set(Key, Value) --> state(ContextIn, ContextOut),
|
||||
{put_assoc(Key, ContextIn, Value, ContextOut)}.
|
||||
|
|
@ -219,14 +159,6 @@ lookup(NameAtom, Label) --> state(Context),
|
|||
state(S), [S] --> [S].
|
||||
state(S0, S), [S] --> [S0].
|
||||
|
||||
|
||||
/*
|
||||
|
||||
This second stage ⟐//1 converts the intermediate representation to assembly
|
||||
language.
|
||||
|
||||
*/
|
||||
|
||||
⟐([]) --> [].
|
||||
⟐([Term|Terms]) --> ⟐(Term), ⟐(Terms).
|
||||
|
||||
|
|
@ -338,57 +270,46 @@ language.
|
|||
|
||||
⟐(pop(Reg, TOS)) --> ⟐([split_word(Reg, TOS), deref(TOS)]).
|
||||
|
||||
|
||||
/*
|
||||
|
||||
Support for ⟐//1 second stage.
|
||||
|
||||
The dexpr//2 DCG establishes a sequence of labeled expr_cell/2 pseudo-assembly
|
||||
memory locations as a linked list that encodes a Prolog list of Joy function
|
||||
labels comprising e.g. the body of some Joy definition.
|
||||
|
||||
*/
|
||||
|
||||
dexpr([], 0) --> [].
|
||||
dexpr([Func|Rest], ThisCell) -->
|
||||
[label(ThisCell), expr_cell(Func, NextCell)],
|
||||
dexpr(Rest, NextCell).
|
||||
|
||||
/*
|
||||
|
||||
The add_label/3 relation is a meta-logical construct that accepts a comparision
|
||||
predicate (e.g. if_zero/2) and "patches" it by adding the Label logic variable
|
||||
to the end.
|
||||
|
||||
*/
|
||||
|
||||
add_label(CmpIn, Label, CmpOut) :-
|
||||
CmpIn =.. F,
|
||||
append(F, [Label], G),
|
||||
CmpOut =.. G.
|
||||
|
||||
/*
|
||||
|
||||
Two simple masking predicates.
|
||||
|
||||
*/
|
||||
|
||||
high_half_word(I, HighHalf) :- HighHalf is I >> 16 /\ 0xFFFF.
|
||||
low_half_word( I, LowHalf) :- LowHalf is I /\ 0xFFFF.
|
||||
|
||||
compile_program(Program, Binary) :-
|
||||
phrase(pass0(Program, AST), [], _),
|
||||
phrase(⟐(AST), IR),
|
||||
phrase(linker(IR), ASM),
|
||||
phrase(asm(ASM), Binary).
|
||||
|
||||
/*
|
||||
|
||||
| o._ | _ ._
|
||||
|_|| ||<(/_|
|
||||
|
||||
Linker
|
||||
|
||||
*/
|
||||
Logical variables for addresses are unified with the actual locations and
|
||||
each instruction is paired with its address.
|
||||
|
||||
linker(ASM) --> enumerate_asm(ASM, 0, _).
|
||||
linker//1 unparses a list of the "IR" that ⟐//1 emits and in turn emits a
|
||||
list of (Address, Instruction) pairs for assembly instructions.
|
||||
|
||||
*/
|
||||
|
||||
linker(IntermediateRepresentation) --> enumerate_asm(IntermediateRepresentation, 0, _).
|
||||
|
||||
enumerate_asm( [], N, N) --> !, [].
|
||||
enumerate_asm( [Term|Terms], N, M) --> !, enumerate_asm(Term, N, O), enumerate_asm(Terms, O, M).
|
||||
enumerate_asm( label(N) , N, N) --> !, [].
|
||||
enumerate_asm(allocate(N, Bytes), N, M) --> !, {Bits is 8 * Bytes}, [skip(Bits)], {align(N, Bytes, M)}.
|
||||
enumerate_asm(allocate(N, Bytes), N, M) --> !, [skip(Bits)], {align(N, Bytes, M), Bits is 8 * Bytes}.
|
||||
enumerate_asm( Instr, N, M) --> [(Z, Instr)], {align(N, 0, Z), align(Z, 4, M)}.
|
||||
|
||||
align(_, Bytes, _) :- (Bytes < 0 -> write('Align negative number? No!')), !, fail.
|
||||
|
|
@ -396,12 +317,19 @@ align(N, 1, M) :- !, M is N + 1.
|
|||
align(N, Bytes, M) :- N mod 4 =:= 0, !, M is N + Bytes.
|
||||
align(N, Bytes, M) :- Padding is 4 - (N mod 4), M is N + Bytes + Padding.
|
||||
|
||||
|
||||
/*
|
||||
|
||||
/\ _ _ _ ._ _ |_ | _ ._
|
||||
/--\_>_>(/_| | ||_)|(/_|
|
||||
|
||||
Assembler
|
||||
|
||||
*/
|
||||
The asm//1 DCG unparses a list of (Address, Instruction) pairs into a list
|
||||
of 0 or 1 ints representing the binary bits of the machine code. The actual
|
||||
work is done by asm//2 which uses instruction_format_Fn//n DCGs to generate
|
||||
the bits.
|
||||
|
||||
*/
|
||||
|
||||
asm([]) --> !, [].
|
||||
asm([ skip(Bits)|Rest]) --> !, skip(Bits), asm(Rest).
|
||||
|
|
@ -538,7 +466,18 @@ asm(N, vs_offset(Label)) --> instruction_format_F3b(0, vs, Label, N).
|
|||
asm(_, vs_link(C)) --> instruction_format_F3a(1, vs, C).
|
||||
asm(N, vs_link_offset(Label)) --> instruction_format_F3b(1, vs, Label, N).
|
||||
|
||||
% This is the core of the assembler where the instruction formats are specified.
|
||||
/*
|
||||
___ _
|
||||
| ._ __|_._ __|_o _ ._ |__ ._._ _ _._|_ _
|
||||
_|_| |_> |_||_|(_ |_|(_)| | |(_)| | | |(_| |__>
|
||||
|
||||
This is the core of the assembler where the instruction formats are
|
||||
specified. Each one is a 32-bit word.
|
||||
|
||||
The various jump-to-offset instructions use the addresses to compute the
|
||||
jump offsets. Offsets and immediate values are converted to binary.
|
||||
|
||||
*/
|
||||
|
||||
instruction_format_F0(U, A, B, Op, C ) --> [0, 0, U, 0], reg(A), reg(B), operation(Op), skip(12), reg(C).
|
||||
instruction_format_F1(U, V, A, B, Op, Im) --> [0, 1, U, V], reg(A), reg(B), operation(Op), immediate(Im).
|
||||
|
|
@ -620,6 +559,19 @@ invert([], []).
|
|||
invert([1|Tail], [0|Lait]) :- invert(Tail, Lait).
|
||||
invert([0|Tail], [1|Lait]) :- invert(Tail, Lait).
|
||||
|
||||
|
||||
/*
|
||||
_
|
||||
|_)o._ _.._ |\ | ._ _ |_ _ .__
|
||||
|_)|| |(_||\/ | \||_|| | ||_)(/_|_>
|
||||
/
|
||||
|
||||
binary_number(ListOfBits, Integer)
|
||||
|
||||
twos_compliment(ListOfBits, Integer, NumberOfBits)
|
||||
|
||||
*/
|
||||
|
||||
twos_compliment(Bits, Number, Width) :-
|
||||
X is abs(Number),
|
||||
binary_number(B, X),
|
||||
|
|
@ -651,8 +603,8 @@ canonical_binary_number([1|Bits], Number):-
|
|||
canonical_binary_number(Bits1, Number1),
|
||||
Number is Number1 + 2 ^ Pow.
|
||||
|
||||
binary_number( Bits , Number) :- canonical_binary_number(Bits, Number).
|
||||
binary_number([0|Bits], Number) :- binary_number(Bits, Number).
|
||||
binary_number( Bits , Number) :- canonical_binary_number(Bits, Number).
|
||||
|
||||
|
||||
% Helper code to write the list of bits as a binary file.
|
||||
|
|
@ -668,8 +620,7 @@ for_serial(Binary, Ser) :-
|
|||
|
||||
write_binary(Name, Binary) :-
|
||||
open(Name, write, Stream, [type(binary)]),
|
||||
for_serial(Binary, Ser),
|
||||
phrase(write_binary_(Stream), Ser),
|
||||
phrase(write_binary_(Stream), Binary),
|
||||
close(Stream).
|
||||
|
||||
write_binary_(Stream) -->
|
||||
|
|
|
|||
Loading…
Reference in New Issue