BadVPN – Blame information for rev 1
Lime: An LALR(1) parser generator in and for PHP.

Interpreter pattern got you down? Time to use a real parser? Welcome to Lime.

If you're familiar with BISON or YACC, you may want to read the metagrammar.
It's written in the Lime input language, so you'll get a head start on
understanding how to use Lime.

0. If you're not running Linux on an IA32 box, then you will have to rebuild
   lime_scan_tokens for your system. It should be enough to delete the
   existing binary and then type "CFLAGS=-O2 make lime_scan_tokens" at the
   bash prompt.

1. Stare at the file lime/metagrammar to understand the syntax. You're seeing
   slightly modified and tweaked Backus-Naur forms. The main difference
   is that you get to name your components, instead of referring to them
   by numbers the way that BISON demands. This idea was stolen from the
   C-based "Lemon" parser from which Lime derives its name. Incidentally,
   the author of Lemon disclaimed copyright, so you get a copy of the C
   code that taught me LALR(1) parsing better than any book, despite the
   obvious difficulties in understanding it. Oh, and one other thing:
   symbols are terminal if the scanner feeds them to the parser. They
   are non-terminal if they appear on the left side of a production rule.
   Lime names semantic categories using strings instead of the numbers
   that BISON-based parsers use, so you don't have to declare any list of
   terminal symbols anywhere.

2. Look at the file lime/lime.php to see what pragmas are defined. To be more
   specific, you might look at the method lime::pragma(), which, at the
   time of this writing, supports "%left", "%right", "%nonassoc",
   "%start", and "%class". The first three set operator precedence and
   associativity. The last two declare the start symbol and the name of
   the PHP class to generate, which will hold all the bottom-up parsing
   tables.

3. Write a grammar file.

4. php /path/to/lime/lime.php list-of-grammar-files > my_parser.php

5. Read the function parse_lime_grammar() in lime.php to understand
   how to integrate your parser into your program.

6. Integrate your parser as follows:

--------------- CUT ---------------

include_once "lime/parse_engine.php";
include_once "my_parser.php";
#
# Later:
#
$parser = new parse_engine(new my_parser());
#
# And still later:
#
try {
	while (..something..) {
		$parser->eat($type, $val);
		# You figure out how to get the parameters.
	}
	# And after the last token has been eaten:
	$parser->eat_eof();
} catch (parse_error $e) {
	die($e->getMessage());
}
return $parser->semantic;

--------------- CUT ---------------

7. You now have the computed semantic value of whatever you parsed. Add salt
   and pepper to taste, and serve.
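
Step 6 leaves it to you to come up with the $type and $val arguments for
eat(). As a rough sketch only, here is one way a hand-written scanner might
produce them for a small arithmetic language. Everything in it is an
assumption made for illustration: the function name tokenize(), the type
names 'num' and 'var', and the choice to use a punctuation character as its
own type. The strings you actually pass as $type have to match the terminal
names used in your grammar file, so adjust them to suit.

```php
<?php
// Hypothetical scanner sketch, not part of Lime. It turns an input string
// into a list of array($type, $val) pairs suitable for feeding to eat().
function tokenize($input) {
    $tokens = array();
    $offset = 0;
    $length = strlen($input);
    while ($offset < $length) {
        // Skip whitespace between tokens. \G anchors the match at $offset.
        if (preg_match('/\G\s+/', $input, $m, 0, $offset)) {
            $offset += strlen($m[0]);
            continue;
        }
        if (preg_match('/\G\d+/', $input, $m, 0, $offset)) {
            // Assumed terminal name 'num'; the value is the integer itself.
            $tokens[] = array('num', (int) $m[0]);
        } elseif (preg_match('/\G[A-Za-z_]\w*/', $input, $m, 0, $offset)) {
            // Assumed terminal name 'var'; the value is the identifier.
            $tokens[] = array('var', $m[0]);
        } elseif (preg_match('/\G[-+*\/()=]/', $input, $m, 0, $offset)) {
            // Assumption: punctuation uses its own character as the type.
            $tokens[] = array($m[0], $m[0]);
        } else {
            throw new Exception("Unexpected character at offset $offset");
        }
        $offset += strlen($m[0]);
    }
    return $tokens;
}

// Usage: walk the pairs in order. With a real parser, the loop body would
// call $parser->eat($type, $val), followed by $parser->eat_eof() at the end.
foreach (tokenize('x = 2 + 31') as $token) {
    list($type, $val) = $token;
}
```

Collecting the tokens into an array first keeps the scanner easy to test on
its own, before any generated parser is involved. For large inputs you may
prefer to call eat() directly from inside the scanning loop instead.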