Extended ALGOL compiler design notes.
Extended ALGOL compiler design notes. Header comments extracted from AL1T1.SR and AL2T1.SR.
Size 11.6 kB - File type text/plainFile contents
; /******************************************************************** ; * * ; * @@@ @ @@@ @@@ @ * ; * @ @ @ @ @ @ @ @ * ; * @ @ @ @ @ @ @ * ; * @@@@@ @ @ @ @ @ * ; * @ @ @ @ @@@ @ @ @ * ; * @ @ @ @ @ @ @ @ * ; * @ @ @@@@@ @@@ @@@ @@@@@ * ; * * ; ********************************************************************/ ; TWO-PASS ALGOL ; REFERENCES: ; NAUR, P., ET AL.: REPORT ON THE ALGORITHMIC LANGUAGE ALGOL 60, ; COMM. ACM, 3(5): 299-314 (1960). ; ; EVANS, A., JR.: AN ALGOL 60 COMPILER, PAPER PRESENTED AT THE ; 18TH ANNUAL MEETING OF THE ACM, DENVER, 1963. ; ; GRAHAM, R. M.: BOUNDED CONTEXT TRANSLATION, PROCEEDINGS OF THE ; SPRING JOINT COMPUTER CONFERENCE, 1964, PP. 17-19. ; ; ARDEN, B. W., AND GRAHAM, R. M.: THE INTERNAL ORGANIZATION OF THE ; MAD TRANSLATOR, COMM. ACM, 4(1): 28-31 (1961). ; ; WARSHALL, S., AND SHAPIRO, R. M.: A GENERAL-PURPOSE TABLE-DRIVEN ; COMPILER, PROCEEDINGS OF THE SPRING JOINT COMPUTER ; CONFERENCE, 1964, PP. 59-65. ; ; BUSAM, V. A., AND ENGLUND, D. E., OPTIMIZATION OF EXPRESSIONS IN ; FORTRAN, COMM. ACM, 12(12): 666-674 (1969). ; DATA GENERAL'S TWO-PASS ALGOL COMPILER USES A METHOD DESCRIBED ; BY EVANS FOR BOUNDED CONTEXT TRANSLATION OF PRECEDENCE GRAMMARS. ; THE SYNTAX IS ENCODED IN A SET OF "PRODUCTIONS". THESE ARE ; SCANNED BY THE SYNTACTIC ANALYZER IN TRANSLATING THE ALGOL ; INPUT STATEMENTS INTO POSTFIX FORM. THIS INTERMEDIATE FORM ; IS USED FOR COMMON SUB-EXPRESSION ELIMINATION. IN A MULTI-PASS ; VERSION OF THE COMPILER, THE INTERMEDIATE CODE IS SCANNED TO ; REMOVE INVARIANT COMPUTATIONS FROM LOOPS. AFTER ALL OPTIMIZATION ; IS COMPLETE, THE INTERMEDIATE CODE IS PASSED TO A CODE SELECTOR ; SIMILAR TO ONE USED BY GRAHAM, GALLER AND ARDEN IN THE TRANSLATOR ; FOR MAD, WHICH USES A TABLE OF CODE PATTERNS AND MAINTAINS ; STATISTICS ON REGISTER USAGE TO MAKE GOOD USE OF THE AVAILABLE ; MACHINE INSTRUCTIONS, REGISTERS AND TEMPORARIES. ; THE SYNTACTIC ANALYZER IS OF THE BOTTOM-UP REDUCTIONS ANALYSIS ; VARIETY, WHERE THE SCAN IS DIRECTED BY A SET OF REDUCTIONS. ; EACH REDUCTION CONTAINS FOUR PARTS. THE FIRST PART IS A PATTERN ; TO MATCH AGAINST THE TOP ELEMENTS ON THE STACK. EACH ELEMENT IN ; THE PATTERN MAY MATCH A SPECIFIC TERMINAL SYMBOL OR A GENERAL ; CLASS OF SYMBOLS SUCH AS "ANY IDENTIFIER OR LITERAL". IF ANY OF ; THE ELEMENTS IN THE PATTERN DOES NOT MATCH THE STACK, THE ANALYZER ; WILL PROCEED TO THE NEXT REDUCTION. IF ALL THE ELEMENTS MATCH ; THEN THE ANALYZER WILL TRANSFORM THE MATCHED ELEMENTS AS SPECIFIED ; BY THE NEXT PART OF THE REDUCTION. EACH OF THE MATCHED ELEMENTS ; MAY BE REPLACED BY ANOTHER MATCHED ELEMENT (OR ANY STACK ELEMENT) ; OR ANY TERMINAL, OR THE MATCHED ELEMENT MAY BE DELETED. AFTER THE ; STACK IS TRANSFORMED, THE ACTION ROUTINES SPECIFIED BY THE NEXT ; PART OF THE REDUCTION ARE CALLED. THESE ROUTINES MAY PERFORM ; SUCH FUNCTIONS AS CHANGING THE TYPE OF AN IDENTIFIER. THE ; LAST PART OF THE REDUCTION SPECIFIES THE NEXT REDUCTION TO ; INTERPRET. IF NONE IS SPECIFIED, THE ANALYZER PROCEEDS TO THE ; NEXT SEQUENTIAL REDUCTION. ; ARITHMETIC EXPRESSIONS ARE ANALYZED BY AN ACTION ROUTINE, THE ; EXPRESSION ANALYZER, WHICH IS DRIVEN BY OPERATOR PRECEDENCE. ; THE EXPRESSION ANALYZER DETERMINES THE TYPE (REAL, INTEGER, ETC.) ; OF EACH SUB-EXPRESSION AND INSERTS APPROPRIATE TYPE CONVERSIONS. ; TO INCREASE THE LIKELIHOOD OF COMMON SUB-EXPRESSION REMOVAL, ; ALL MINUS SIGNS ARE MOVED TO THE HIGHEST POSSIBLE POSITION IN ; THE EXPRESSION. FOR EXAMPLE, THE EXPRESSION (-A-B) IS ENCODED ; AS -(A+B). FOR OPERATIONS THAT ARE COMMUTATIVE, THE OPERANDS ; ARE SORTED AND THE SIGNS CHANGED AS NECESSARY. ; INPUT SYMBOLS FOR THE SYNTACTIC ANALYZER AND EXPRESSION ANALYZER ; ARE RECOGNIZED BY A GENERAL-PURPOSE LEXICAL ANALYZER. ; AMBIGUITIES, SUCH AS LABELS RECOGNIZED AS INTEGER LITERALS, ARE ; HANDLED BY ACTION ROUTINES. THE STRING FORM OF TERMINAL SYMBOLS ; IS SPECIFIED IN THE CALLS TO A KEY WORD RECOGNIZER. TO ; ADD A TERMINAL SYMBOL OR OPERATOR, A CALL MUST BE ADDED IN THE ; APPROPRIATE SPOT IN THE LEXICAL ANALYZER AND AN ENTRY MUST BE ; ADDED TO THE TERMINAL SYMBOL TABLE. ; THE CODE SELECTOR IS DRIVEN BY CODE PATTERNS FOR EACH OPERATOR. ; EACH CODE PATTERN LINE MAY BE A SYMBOLIC NOVA INSTRUCTION OR A ; COMMAND TO THE CODE SELECTOR. THE SYMBOLIC INSTRUCTION MAY CONTAIN ; REFERENCES TO THE ADDRESS OF AN OPERAND OR REGISTERS CONTAINING THE ; OPERAND. THESE HAVE THE FORM "r" (REGISTER CONTAINING SPECIFIED ; OPERAND, "x" (INDEX REGISTER CONTAINING SPECIFIED OPERAND), "s" ; (OPERAND IN SPECIFIED REGISTER) OR "o" (SPECIFIED OPERAND); ; FOLLOWED BY A DIGIT INDICATING THE SPECIFIED NUMBER OR A SMALL LETTER ; PSEUDO-VARIABLE CONTAINING THE SPECIFIED NUMBER. A "t" REFERENCED ; IN THE SYMBOLIC INSTRUCTION LINE WILL BE TRANSLATED INTO A LETTER ; INDICATING INSTRUCTION TYPE, "F" FOR FLOATING POINT, OR NULL FOR ; INTEGER AND BOOLEAN. IF "o" IS REFERENCED FOR A MATRIX OPERAND, ; A STORAGE LOCATION WILL BE ASSIGNED FOR A TEMPORARY AND THIS WILL ; BE OUTPUT. ; THE COMMANDS TO THE CODE SELECTOR MAY TEST TO DETERMINE IF AN ; OPERAND IS IN A REGISTER AND TRANSFER ACCORDINGLY TO ANOTHER LINE ; IN THE CODE PATTERN. NUMERICAL VALUES MAY BE ASSIGNED TO THE ; PSEUDO-VARIABLES AND OTHER CODE PATTERNS MAY BE CALLED WITH ; THESE AS ARGUMENTS. THE CODE SELECTOR KEEPS TRACK OF REGISTER ; CONTENTS TO OPTIMIZE THE USE OF REGISTERS, INSTRUCTIONS AND ; TEMPORARIES THROUGHOUT THE PROGRAM. ; VALID CODE SELECTOR COMMANDS ARE: ; /C <PATTERN NUMBER> [<VARIABLE> <COUNTER>]/ ; CALL THE PATTERN THE SPECIFIED NUMBER OF TIMES ; PASSING THE COUNTER IN VARIABLE. ; /X [<REGISTER>]/ ; EXIT FROM A CODE PATTERN, OPTIONALLY SETTING THE ; CURRENT MATRIX VALUE AS THE CONTENTS OF THE ; SPECIFIED REGISTER. ; /J <JUMP>/ ; JUMP FORWARD IN THE CODE PATTERN BY THE AMOUNT ; SPECIFIED. ; /SV <REGISTER>/ ; /S <REGISTER> <OPERAND>/ ; /S <VARIABLE> <VARIABLE>/ ; /S <VARIABLE> <DIGIT>/ ; /S <VARIABLE> <REGISTER>/ ; SET THE STATUS OR CONTENTS OF THE SPECIFIED REGISTER, ; "V" INDICATES SET REGISTER VACANT. ; SET EQUAL, SETS TWO VARIABLES EQUAL, A VARIABLE ; EQUAL TO A NUMBER, OR A VARIABLE EQUAL TO A REGISTER ; NUMBER. ; /TV <REGISTER> <JUMP>/ ; /TM <OPERAND> <JUMP>/ ; /T <REGISTER> <JUMP>/ ; /T <VARIABLE> <VARIABLE> <JUMP>/ ; /T <REGISTER> <REGISTER> <JUMP>/ ; TEST AND JUMP IF TRUE, TESTS FOR THE SPECIFIED CONDITION, ; IF AN OPERAND IS IN ANY REGISTER, IF AN OPERAND ; IS IN A PARTICULAR REGISTER, OR IF TWO VARIABLES ; ARE EQUAL. "M" TESTS IF A REGISTER CONTAINS A MATRIX ; OPERAND. ; /F... TEST AND JUMP IF FALSE, SAME AS ABOVE. ; ; STORAGE FOR COMPILER TABLES IS COMPOSED OF FIVE WORD ELEMENTS ; WHICH ARE THREADED TOGETHER TO FORM STACKS AND VARIOUS OTHER ; DATA LISTS. EACH STACK ELEMENT HAS THE FOLLOWING FORMAT: ; ; **************** ; * .FWDP * -----> POINTER TO NEXT ELEMENT ; **************** ; * .TYPE * ELEMENT TYPE (REAL, INTEGER, BOOLEAN) ; **************** ; * .TABL * TABLE NUMBER (IDENTIFIER, LITERAL, ; **************** TERMINAL NUMBER, OR GENERATED LABEL) ; * .DATA * -----> DATA OR POINTER TO DATA ; **************** ; * .DOPE * -----> SPECIFICATION INFORMATION ; **************** ; ; ELEMENTS IN THE IDENTIFIER STACK MAY BE BLOCK HEADERS OR ; IDENTIFIER ELEMENTS. BLOCK HEADERS CONTAIN THE FOLLOWING: ; ; **************** ; * .FWDP * -----> NEXT ELEMENT IN IDENTIFIER STACK ; **************** ; * .LINK * -----> POINTER TO LAST ENTRY IN NEXT OUTER BLOCK ; **************** ; * .LAST * -----> POINTER TO LAST ENTRY OF THIS BLOCK ; **************** ; * 0 * ZERO INDICATES BLOCK HEADER ELEMENT ; **************** ; * .DOPE * -----> POINTER TO DOPE ELEMENT FOR BLOCK ; **************** ; ; THE DOPE ELEMENT FOR A BLOCK CONTAINS THE FOLLOWING: ; ; **************** ; * .FWDP * -----> POINTER TO MORE DOPE, NORMALLY ZERO ; **************** ; * .CSLC * CURRENT STORAGE LOCATION COUNTER WHEN ; **************** THIS BLOCK WAS ENTERED ; * .CCBH * -----> POINTER TO CURRENT BLOCK HEADER WHEN ; **************** THIS BLOCK ENTERED ; * .CCPH * -----> POINTER TO CURRENT PROCEDURE HEADER ; **************** ; * .BLEV * CURRENT PROCEDURE LEVEL ; **************** ; ; IDENTIFIER ELEMENTS CONTAIN THE FOLLOWING: ; ; **************** ; * .FWDP * -----> NEXT ELEMENT IN IDENTIFIER STACK ; **************** ; * .LINK * -----> NEXT ENTRY IN SEARCH THREAD, ZERO IF ; **************** END OF THREAD ; * .CCBH * -----> POINTER TO CURRENT BLOCK HEADER ; **************** ; * .DATA * -----> POINTER TO CHARACTER STRING ELEMENT ; **************** ; * .DOPE * -----> POINTER TO IDENTIFIER DOPE ELEMENT ; **************** ; ; THE CHARACTER STRING DATA ELEMENT HAS THE FOLLOWING FORMAT: ; ; **************** ; * .FWDP * -----> POINTER TO MORE STRING DATA, OR ZERO ; **************** IF END OF CHARACTER STRING ; * # OF WORDS * CHARACTER DATA, NULL CHARACTERS ARE IGNORED ; **************** ; * C1 * C2 * SECOND AND SUCCEEDING ELEMENTS HAVE ; **************** EIGHT (8) CHARACTERS PER ELEMENT. ; * C3 * C4 * ; **************** ; * C5 * C6 * ; **************** ; ; THE IDENTIFIER DOPE ELEMENT CONTAINS THE FOLLOWING INFORMATION: ; ; **************** ; * .FWDP * -----> POINTER TO ADDITIONAL DOPE, NORMALLY ZERO ; **************** ; * .TYPE * DATA TYPE, ZERO IF UNDEFINED, OTHERWISE ; **************** INTEGER, REAL, BOOLEAN, LABEL, COMPLEX ; * .CLAS * IDENTIFIER CLASS (NORMAL VARIABLE, ARRAY, ; **************** LABEL, OR PROCEDURE) ; * .ADDR * LOCATION RELATIVE TO THE BASE OF THE DATA ; **************** STORAGE AREA, ZERO IF LABEL ; * .KIND * STORAGE CLASS ; **************** ; THE TERMINAL TABLE IS INDEXED BY TERMINAL NUMBER (DATA IN STACK ELEMENT) ; MINUS 48 (ALL TERMINAL NUMBERS ARE GREATER THAN 48). EACH ENTRY IN ; THE TERMINAL TABLE CONTAINS THE FOLLOWING: ; ; **************** ; * .TTYP * TERMINAL TYPE, ZERO IS NORMAL OPERATOR ; **************** ; * .LPRC * LEFT PRECEDENCE (PRECEDENCE WHEN COMPARED TO ; **************** AN OPERATOR ON ITS RIGHT) ; * .RPRC * RIGHT PRECEDENCE ; **************** ; * .RTYP * RESULT TYPE OR -1 IF SAME AS CONVERSION ; **************** .EOT ; END OF AL1T1.SR ; ; * ; ** ; * ; * ; * ; * ; *** ; ; ALGOL PHASE 1 ZEROES ALL GLOBAL TABLES AND VARIABLES, ALLOCATES ; STORAGE FOR COMPILER WRITEABLE TABLES AND INITIALIZES GLOBAL ; TABLES AND VARIABLES WHERE REQUIRED. A LEXICAL AND SYNTACTIC ; ANALYSIS OF THE SOURCE PROGRAM IS PERFORMED, PRODUCING AN ; INTERMEDIATE CODE FOR FURTHER PROCESSING BY PHASE 2, THE CODE ; GENERATOR. THE OVERLAY FOR PHASE 2 DOES NOT DESTROY GLOBAL PAGE ; ZERO VARIABLES OR CONSTANTS, NOR DOES IT DESTROY GLOBAL TABLES ; BUILT FROM THE TOP OF AVAILABLE MEMORY. ; ** ; * * ; * ; * ; * ; * ; **** ; ; ALGOL PHASE 2 IS PASSED THE PRESERVED SYMBOL TABLES IN CORE ; AND A FILE CONTAINING INTERMEDIATE CODE IN THE FORM OF A ; CODE MATRIX. EACH ROW OF THE MATRIX CONTAINS AN OPERATOR ; FOLLOWED BY ZERO OR MORE OPERANDS WHICH MAY POINT TO OTHER MATRIX ; ROWS OR TO ELEMENTS IN THE SYMBOL TABLES. THE CODE SELECTOR ; INTERPRETS A CODE PATTERN FOR EACH OF THESE ROWS AND GENERATES ; CODE ACCORDINGLY.