Diffstat (limited to 'doc/macro-compile-parse.txt')
-rw-r--r-- | doc/macro-compile-parse.txt | 240 |
1 files changed, 0 insertions, 240 deletions
diff --git a/doc/macro-compile-parse.txt b/doc/macro-compile-parse.txt
deleted file mode 100644
index 9c79f6c68..000000000
--- a/doc/macro-compile-parse.txt
+++ /dev/null
@@ -1,240 +0,0 @@
-the whole Parse chain is generally set in motion via
-
-SbiModule::Compile
-    while ( SbiParser::Parse() )
-
-for the purpose of example consider the following module code
-
-e.g.
-line: content
--------------
-1 REM ***** BASIC *****
-2
-3 Sub Main
-4 n = 52
-5 cMAX = 10.234
-6 Dim mRangeArray(0, 0) as String
-7 ReDim mRangeArray(CInt(cMAX), n) as String
-8 'msgbox ( ("here"), cInt(2) , "foo" )
-9 End Sub
-
-
-this compiles into the following pcode
-
-SbiRuntime::StepSTMNT (3, 0)
-SbiRuntime::StepSTMNT (4, 0)
-SbiRuntime::StepFIND (2, 12)
-SbiRuntime::StepLOADI (52)
-SbiRuntime::StepPUT
-SbiRuntime::StepSTMNT (5, 0)
-SbiRuntime::StepFIND (3, 12)
-SbiRuntime::StepLOADNC (4)
-SbiRuntime::StepPUT
-SbiRuntime::StepSTMNT (6, 0)
-SbiRuntime::StepLOCAL (5, 8)
-SbiRuntime::StepARGC
-SbiRuntime::StepLOADI (0)
-SbiRuntime::StepBASED (0)
-SbiRuntime::StepARGV
-SbiRuntime::StepARGV
-SbiRuntime::StepLOADI (0)
-SbiRuntime::StepBASED (0)
-SbiRuntime::StepARGV
-SbiRuntime::StepARGV
-SbiRuntime::StepFIND (32773, 8)
-SbiRuntime::StepDIM
-SbiRuntime::StepSTMNT (7, 0)
-SbiRuntime::StepFIND (5, 8)
-SbiRuntime::StepERASE
-SbiRuntime::StepARGC
-SbiRuntime::StepARGC
-SbiRuntime::StepFIND (3, 12)
-SbiRuntime::StepARGV
-SbiRuntime::StepRTL (32774, 2)
-SbiRuntime::StepBASED (0)
-SbiRuntime::StepARGV
-SbiRuntime::StepARGV
-SbiRuntime::StepFIND (2, 12)
-SbiRuntime::StepBASED (0)
-SbiRuntime::StepARGV
-SbiRuntime::StepARGV
-SbiRuntime::StepFIND (32773, 8)
-SbiRuntime::StepREDIM
-SbiRuntime::StepSTMNT (9, 0)
-SbiRuntime::StepLEAVE
-
-REM and its line contents are practically ignored as they are not compilable content, ditto the blank lines; the next significant line to be processed is
-
-Sub Main
-which initially yields the SUB token, SUB is processed as follows
-
-where
-SbiParser::Parse
-    SbiParser::Peek()
-        tests for end of file
-            if true ( generates code ( JMP 0 ) for that ) and returns FALSE to terminate the compile
-        tests for end of line
-            calls Next() and returns TRUE ( indicating more stuff to parse )
-
-the main SbiParser::Parse function detects the main statements to process like function/subroutine etc. ( keywords are picked up from StmntTable in parser.cxx, note in addition to the statements the appropriate handlers (functions) are also defined e.g. SbiParser::SubFunc )
-    A SUB, FUNCTION or PROPERTY results in a STATEMENT getting generated
-
-SbiParser::SubFunc
-    SbiParser::DefProc
-        create a SbiProcdecl by calling SbiParser::ProcDecl
-            a new SbiProcDef is created to hold procedure related info e.g. ( scope ( static, public etc. ), params defined for the procedure, local variables, return type )
-        if public the SbiProcDecl instance is added to aPublics ( list of public procedures )
-        SbiParser::pProc is set to the current procedure
-        OpenBlock() is called ( with SUB token )
-            creates a SbiParseStack ( and sets SbiParser::pStack to that )
-        StmntBlock( ENDSUB )
-            does a while loop on Parse ( i.e. a recursive call )
-                SbiParser::Parse()
-1st iter
-=========
-    processes the tokens e.g. like line 4 "n = 52"
-        the first token to be read is n, which doesn't match anything, so the token is deemed to be a SYMBOL ( e.g.
to be resolved later or at runtime, basically a SYMBOL is a lhs arg or a procedure call )
-        a SYMBOL is treated as follows
-            strange Next()/Push() call combination where Next() returns the previously "Peeked" SYMBOL, followed by a Push() to ensure Next() will be set up to return SYMBOL again
-            a STATEMENT is generated
-            SbiParser::Symbol() is called
-                which creates a SbiExpression aVar( this, SbSYMBOL ) [1]
-                peeks at the next symbol to decide whether this is a procedure/object call or EQ ( in this case it is EQ )
-                aVar.Gen( eRecMode ) ( from above ) is called, the value of eRecMode reflects whether this is EQ or a call
-
-                calls SbiExprNode::GenElement( SbiOpcode eOp )(..) which generates an eOp _FIND e.g. ( StepFIND( id, type ) )
-                    pGen->Gen( _FIND.... ) where pGen is a SbiCodeGen
-                        note: the first call of SbiCodeGen::Gen will generate a statement ( STMNT ) prior to the desired type ( so in fact this will generate StepSTMNT(..) followed by StepFIND )
-                the above generates the lhs of the assignment
-                for the rhs another SbiExpression is created e.g.
SbiExpression aExpr(this) - note the lack of a SbSYMBOL in the ctor [3]
-                aExpr.Gen() is called ( which generates _NUMBER with 52 )
-                and then a _PUT is forced
-2nd iter
-========
-    processes newline
-3rd iter
-========
-    processes cMAX = 10.234
-    the processing is identical to the previous processing for "n = 52"
-4th iter
-========
-
-    Dim mRangeArray(0, 0) as String
-
-    first token to be consumed is of course the Dim
-    from the StmntTable we know SbiParser::DIM
-    First a STMNT is generated
-    then SbiParser::Dim()
-        calls SbiParser::DefVar()
-            calls SbiParser::VarDecl
-                if the peeked token is a LPAREN then
-                    creates pDim = new SbiDimList( this );
-                        loops around each comma delimited expression ( and creates a SbiExpression for those )
-            call TypeDecl to see if there is an 'as something' following the variable declaration
-            add the SbiSymDef returned from SbiParser::VarDecl to the local variables
-            another STMNT is emitted
-            An expression is created from the SbiSymDef returned from SbiParser::VarDecl ( no further parsing here, it's already done, it just sets up the info in the SbiExpression to allow the appropriate p-code to be emitted )
-            calling Gen() on the SbiExprNode recursively calls each SbiExprNode via aVar.pNext ( where aVar is essentially a structure containing a linked list of more nodes, parameters ( SbiExprList ( ARGC/ARGV ) ) & a symbol definition )
-
-this essentially results in the following pcode generated for the line above
-
-SbiRuntime::StepLOCAL (5, 8)        aVar.pNext->Gen() ( I think )
-SbiRuntime::StepARGC                aVar.pVar->Gen() - results in the SbiExprList->Gen()
-SbiRuntime::StepLOADI (0)
-SbiRuntime::StepBASED (0)
-SbiRuntime::StepARGV
-SbiRuntime::StepARGV
-SbiRuntime::StepLOADI (0)
-SbiRuntime::StepBASED (0)
-SbiRuntime::StepARGV
-SbiRuntime::StepARGV
-SbiRuntime::StepFIND (32773, 8)
-SbiRuntime::StepDIM
-
-5th iter
----------
-7: ReDim mRangeArray(CInt(cMAX), n) as String
-
-    this follows nearly the same path as above with the exception that a secondary sub-parsing happens for CInt(cMAX), which results in some extra processing for the RTL function ( and its parameters )
-[1]
-SbiExpression() this class is passed an instance of the parser and all of the work is done in the ctor,
-    basically depending on the type passed either
-    SbiExpression::Term() or SbiExpression::Boolean() is called, each of which returns a SbiExprNode ( )
-    SbiExpression::Term()
-        can handle a following '.' DOT and associated processing or..
-        locks the column ( why? ) pParser->LockColumn
-        Peeks at the following TOKEN
-            if the peeked token == ASSIGN
-                unlocks the column if it's an ASSIGN and returns a new SbiExprNode( represents a single or list of expressions left to right *I think* )
-            if it's a keyword and it's compatible and == INPUT then INPUT is turned into SYMBOL and returned, otherwise an Error is generated ( looks like INPUT has been filtered out here )
-                check with other compat filtering to see why here, is this useful for filtering out other vba specific bits?
-            there is a check to see if parameters follow
-                if so then SbiParameters are created ( which can create further SbiExpressions etc. )
-            see if the current symbol ( "n" ) at this stage is in the pParser->pPool or if it's in the RTL library e.g.
pParser->CheckRTLForSym( aSym, eType );
-            if we don't have a symbol definition for the symbol then
-                create a new symbol definition via AddSym[2]
-            create a new SbiExprNode
-            the SbiExprNode always has a valid instance of SbiParameters associated with it ( even if there are no parameters ) so a new SbiParameters is created
-            column is unlocked and the SbiExprNode created previously is returned
-
-
-[2]
-A symbol definition is defined by the SbiSymDef class, it of course defines the name of the symbol, its type, its scope, and can of course also define its own symbols ( e.g. if the symbol is a procedure )
-
-[3]
-SbiExpression::SbiExpression calls
-    SbiExpression::Boolean() is called for a Standard expression
-        SbiExprNode pNd is created from SbiExpression::Like()
-            SbiExpression::Like() calls SbiExprNode* pNd = Cat(), which in turn calls AddSub, Mod, IntDiv, MulDiv, Exp, Unary,
-            in the case of "n = 52" in Unary() pNd is set to the result of Operand()
-            Operand()
-                just returns a new SbiExpression for a simple number, the result falls back through each of the previous AddSub(), Mod(), IntDiv() etc.
calls
-
-
-The Tokenizer
-=============
-
-The tokenizer is a strange beast that operates with
-
-Next() & Peek() semantics,
-
-there is no point talking about how the strings in the module are processed, suffice to say the SbiParser has the following inheritance
-
-SbiScanner
-    |
-SbiTokenizer
-    |
-SbiParser
-
-where SbiScanner does the heavy lifting wrt parsing the raw lines into Symbols
-
-Peek()
-======
-
-you would think it would return the next symbol ( lookahead ) and it *sort of* works like this
-
-if ePush IS NIL then ePush is set to the result of Next() ( which is what sets ePush )
-
-eCurTok is always set to ePush ( this is the strange bit for me )
-eCurTok is returned
-
-so it seems if you call Peek() you will get the Next() token only if you didn't call Peek() previously
-
-
-Next()
-======
-basically if Peek() was called previously then Next() returns the value previously 'Peeked' at ( and sets ePush to NIL to force either the next call to Peek() or the next call to Next() to read the next token )
-    what makes this strange for me is the fact that Peek() modifies the current token ( i.e. eCurTok )
-
-
-if ePush ISN'T NIL ( NIL is the initial and terminating state ) eCurTok is set to ePush and ePush is set to NIL
-
-otherwise it gets the next symbol ( via NextSym() which populates aSym )
-aSym is compared with the tokens in pTokTable
-    pTokTable contains keywords and symbols like ( &,+,And,Goto,If etc... ) { populated from aTokTable_Basic }
-when aSym matches something from pTokTable then it is handled in the 'special' labelled code branch ( urky ) c/c++ mix
-    the 'special' branch can cause recursion if the symbol is 'LINE' or 'END' ( this is so END FUNCTION, END SUB, LINE INPUT etc. will be resolved as single tokens )
-    if aSym is neither LINE nor END then some further magic happens like
-        * detecting datatype symbols e.g. INTEGER, DOUBLE, OBJECT etc.
-        * detecting As
-        * suppressing compatibility only tokens ( like CLASSMODULE, ENUM etc.
)
-            these are suppressed by returning the SYMBOL token
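
The Peek()/Next() bookkeeping described in the deleted notes is easier to follow when modelled in a few lines. The following is only a toy Python sketch of that description, not the actual SbiTokenizer code: the class name TokenizerModel, the string tokens, the "EOF" marker and the _read() helper are all assumptions made for illustration, and push() follows the notes' remark that a Push() sets Next() up to return the same token again.

```python
NIL = None  # stands in for the NIL token ( initial and terminating state )


class TokenizerModel:
    """Toy model of the ePush / eCurTok handling described in the notes."""

    def __init__(self, tokens):
        # Hypothetical token source; replaces NextSym() and the pTokTable lookup.
        self._source = iter(tokens)
        self.ePush = NIL    # one-token pushback slot
        self.eCurTok = NIL  # the "current" token

    def _read(self):
        # Illustrative helper: yields "EOF" once the input is exhausted.
        return next(self._source, "EOF")

    def next(self):
        if self.ePush is not NIL:
            # A token was peeked or pushed earlier: hand it out and clear the slot.
            self.eCurTok = self.ePush
            self.ePush = NIL
        else:
            self.eCurTok = self._read()
        return self.eCurTok

    def peek(self):
        if self.ePush is NIL:
            self.ePush = self.next()
        # Peek() also overwrites the current token, the "strange bit" in the notes.
        self.eCurTok = self.ePush
        return self.eCurTok

    def push(self, token):
        # Assumed behaviour: the pushed token is what the following next() returns.
        self.ePush = token


tok = TokenizerModel(["SYMBOL", "EQ", "NUMBER"])
print(tok.peek())    # SYMBOL
print(tok.peek())    # SYMBOL ( a second peek does not advance )
print(tok.next())    # SYMBOL ( returns the peeked token and clears ePush )
print(tok.next())    # EQ
```

In this model a repeated peek() is idempotent, and the Next()/Push() combination mentioned for SYMBOL handling amounts to calling next() and then push() with the same token.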