The grand strategy of Ruby Parser

Write grammar rule with BNF (Backus–N...

LR parser
Can hand... large range of languages
Major parser algorithm
To be precise, LR-attributed grammar
I believe a grammar that is easy for humans is close to an LR grammar
LL parser
Has less power than an LR parser
PEG
It’s difficult to create an error-tolerant parser
A rule failure doesn’t imply a parsing failure, as it does in context-free grammars
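
To make the LR idea above concrete, here is a minimal sketch of a table-driven shift-reduce parser in Ruby. It is not how parse.y, Bison, or Lrama generate code. The toy grammar, the hand-derived `ACTION` and `GOTO` tables, and the names `lex` and `parse` are all assumptions made for illustration.

```ruby
# Toy grammar (rule numbers are referenced by the tables below):
#   1: expr -> expr "+" num
#   2: expr -> num
# ACTION and GOTO were derived by hand for this grammar only.
ACTION = {
  [0, :num]  => [:shift, 2],
  [1, :plus] => [:shift, 3],
  [1, :eof]  => [:accept, nil],
  [2, :plus] => [:reduce, 2],
  [2, :eof]  => [:reduce, 2],
  [3, :num]  => [:shift, 4],
  [4, :plus] => [:reduce, 1],
  [4, :eof]  => [:reduce, 1],
}.freeze

GOTO = { [0, :expr] => 1 }.freeze

# rule number => [left-hand side, right-hand side length, semantic action]
RULES = {
  1 => [:expr, 3, ->(left, _plus, right) { left + right }],
  2 => [:expr, 1, ->(num) { num }],
}.freeze

# Turn an array of words such as %w[1 + 2] into [type, value] tokens.
def lex(words)
  words.map { |w| w == "+" ? [:plus, nil] : [:num, Integer(w)] } + [[:eof, nil]]
end

def parse(words)
  tokens = lex(words)
  states = [0]  # stack of automaton states
  values = []   # stack of semantic values
  pos = 0
  loop do
    type, value = tokens[pos]
    action, arg = ACTION[[states.last, type]]
    case action
    when :shift
      states.push(arg)
      values.push(value)
      pos += 1
    when :reduce
      lhs, length, semantic_action = RULES.fetch(arg)
      states.pop(length)
      values.push(semantic_action.call(*values.pop(length)))
      states.push(GOTO.fetch([states.last, lhs]))
    when :accept
      return values.last
    else
      raise ArgumentError, "syntax error at token #{pos} (#{type})"
    end
  end
end
```

Calling `parse(%w[1 + 2 + 3])` walks the tables and reduces left to right. An input such as `%w[1 +]` reaches a state with no table entry for the lookahead and raises. A parser generator computes such tables from the grammar instead of having them written by hand.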

https://bugs.r...

A lot of rules are...
Argument is optional
Parentheses around arguments are optional
Block is optional
(The symbol of pattern matching, `in` or `=>`)
Need to discuss grammar rules as a group
E.g. “a == b”, “1 + 2” and “1..2” are in the same “arg” group
If we change the “arg” rules, we need to consider the impact on “expr” and “stmt” too
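
As a small illustration of the optional pieces listed above, the following Ruby snippet calls one method in several equivalent surface forms. The method `greet` and the constants `RESULTS` and `EXPRESSIONS` are hypothetical names introduced only for this sketch.

```ruby
# Hypothetical helper used only for this illustration.
def greet(name = "world")
  message = "hello, #{name}"
  block_given? ? yield(message) : message
end

no_args_no_parens   = greet          # argument and parentheses omitted
no_args_with_parens = greet()        # parentheses, no argument
arg_no_parens       = greet "ruby"   # argument without parentheses
arg_with_parens     = greet("ruby")  # argument with parentheses
with_brace_block    = greet("ruby") { |m| m.upcase }
with_do_block       = greet "ruby" do |m|
  m.upcase
end

RESULTS = [
  no_args_no_parens, no_args_with_parens,
  arg_no_parens, arg_with_parens,
  with_brace_block, with_do_block,
].freeze

# Comparison, arithmetic and range forms are all ordinary expressions.
EXPRESSIONS = [1 == 1, 1 + 2, 1..2].freeze
```

Every one of these forms has to be accepted by the grammar, which is one reason a change to a single rule can affect many others.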

parse_conditional in...

Joel E. Denny. “PSLR(1): Pseudo-Scannerless Minimal LR(1) for the Deterministic Parsing of Composite Languages”, May 2010. https://tigerprints.clemson.edu/cgi/viewcontent.cgi?article=1519&context=all_dissertations
Lukas Diekmann and Laurence Tratt. “Don’t Panic! Better, Fewer, Syntax Errors for LR Parsers”, July 2020. https://arxiv.org/pdf/1804.07133.pdf
Joe Zimmerman. “Practical LR Parser Generation”, Sep 2022. https://arxiv.org/pdf/2209.08383.pdf

Yuichiro Kaneko. “Ruby...")
parser׬શʹཧղ͠ ͨ”, December 2023. https://yui-knk.hatenablog.com/entry/ 2023/12/06/082203 shioimm/coe401_. “ͨͷ͍͠RubyͷߏจղੳπΞʔ”, March 2023. https://speakerdeck.com/coe401\_/tanosiirubynogou-wen-jie-xi- tua aamine. “Rubyιʔείʔυ׬શղઆ” ୈ 2 ෦ʮߏจղੳʯ, July 2004. https://i.loveruby.net/ja/rhg/book/ [JA] https://ruby-hacking-guide.github.io/ [EN]

1965: Donald E. Knuth inve... parsing. “On the translation of languages from left to right”
1975: Yacc is published
1985: GNU Bison initial release
1989: Berkeley Yacc initial release
2006: GCC migrates its parser from Bison to a hand-written recursive-descent parser (C++ was 2004)
2015: Go migrates its parser from Bison to a hand-written recursive-descent parser

Bison is not perfect
The... hack parse.y
We need more and more features
It is not easy to add new features to Bison
Ruby build system depends on the Bison installed on your machine
Lrama is installed into the ruby/ruby tool directory, so we can use the latest features
Bison is difficult to manage
It was broken even though we didn't do anything when we released Ruby 2.7.7
Installing Bison on Windows in particular is not an easy task

The grand strategy of Ruby Parser
Lon...
Provide platform for LSP and other tools
Provide Universal parser
Keep both Ruby grammar and parser maintainable
Solution
LR parser and parser generator are the best friends for Ruby
Lrama is the new foundation for the Ruby parser instead of Bison

LSP
parse.y for Undergr...

Any kinds of nodes share... single struct definition
There is no flexibility to add a new field to a specific type of node
It’s not straightforward to cast each field based on node type
Need to change the data structure from a union-based struct to a dedicated struct for each node
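
The real node definitions live in C inside ruby/ruby and are not reproduced here. As a loose Ruby analogy of the change described above, the sketch below contrasts one generic node shape with dedicated shapes per node kind. All names (`GenericNode`, `u1`, `IfNode`, `WhileNode`, `StrNode`, `generic_if_condition`) are hypothetical.

```ruby
# Before: one generic shape shared by every node kind.
# The meaning of u1/u2/u3 depends on `type`, so each reader must know the mapping.
GenericNode = Struct.new(:type, :u1, :u2, :u3)

def generic_if_condition(node)
  raise ArgumentError, "not an :if node" unless node.type == :if
  node.u1 # by convention in this sketch, u1 holds the condition of an :if node
end

# After: one dedicated shape per node kind, with named fields.
IfNode    = Struct.new(:condition, :then_body, :else_body)
WhileNode = Struct.new(:condition, :body)
# A field can be added to one node kind without touching the others.
StrNode   = Struct.new(:value, :frozen)
```

With dedicated shapes, `IfNode.new(:c, :t, :e).condition` reads directly, while the generic form needs the type check and the slot convention.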

Practice
Theory...

Manage local variables tables and ...

It seems a good idea to integrate parser... lexer then change to manage states on parser side
Joel E. Denny. “PSLR(1): Pseudo-Scannerless Minimal LR(1) for the Deterministic Parsing of Composite Languages”, May 2010. https://tigerprints.clemson.edu/cgi/viewcontent.cgi?article=1519&context=all_dissertations
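
A minimal sketch of what "managing states on the parser side" can mean: the scanner only tries the token kinds that the parser says are acceptable next. This is not PSLR(1) itself and not Lrama's implementation. The two-state "parser", the token kinds, and the names `PATTERNS`, `scan_token`, and `tokenize` are assumptions for illustration. It uses Ruby's standard `strscan` library.

```ruby
require "strscan"

PATTERNS = {
  regexp:     %r{/(?:[^/\\]|\\.)*/},
  divide:     %r{/},
  number:     /\d+/,
  identifier: /[a-z_]\w*/,
}.freeze

# Stand-in for parser state: which token kinds are acceptable next.
BEFORE_OPERAND = %i[regexp number identifier].freeze
AFTER_OPERAND  = %i[divide].freeze

# Try only the kinds the parser expects, in order.
def scan_token(scanner, expected)
  expected.each do |kind|
    text = scanner.scan(PATTERNS.fetch(kind))
    return [kind, text] if text
  end
  nil
end

def tokenize(source)
  scanner = StringScanner.new(source)
  expected = BEFORE_OPERAND
  tokens = []
  loop do
    scanner.skip(/\s+/)
    break if scanner.eos?
    token = scan_token(scanner, expected)
    raise ArgumentError, "unexpected input at offset #{scanner.pos}" unless token
    tokens << token
    # After an operator an operand must follow; after an operand, an operator.
    expected = token.first == :divide ? BEFORE_OPERAND : AFTER_OPERAND
  end
  tokens
end
```

With this arrangement `tokenize("a / b / 2")` yields two `:divide` tokens, while `tokenize("/a b/ / 2")` starts with a `:regexp` token, because `/` is interpreted according to what is expected at that position rather than by a separately maintained scanner state.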

> Nevertheless, traditional scanner an...")
to generate loosely coupled scanners and parsers, so the user must maintain these tightly coupled scanner and parser specifications separately but consistently.

> Scanner and parser specifications would be significantly more maintainable if all sub-language transitions were instead computed from a grammar by a parser generator and recognized automatically by the scanner using the parser’s stack.

Roadmap (diagram)
Items: LSP; Delete parser level optimization; Union to Struct (Node); User friendly node structure; parse.y for Undergraduate; More declarative parser; Efficient data structure (Cactuses); Delete operation support; Integration to parse.y; More accurate recovery; Parameterizing rules; Replace hand written parser with Racc; User defined stack; Scanner state update syntax; Scannerless parser; IELR; Error tolerance; Universal Parser
Labels: Parser Generator (Lrama), Parser
Status marks: ✅ ×4, 💪 ×4

Ruby AST structure is m... GC as imemo object
imemo is “Internal memo object” managed by GC
Ruby’s GC is useful. It frees memory which is not used anymore
Before reaching this goal, objects need to be removed from nodes
Objects on nodes are GC marked via AST structure

“str” :sym

Roadmap (diagram, updated)
Items added since the previous version: Decouple AST from imemo; Remove...; Refactoring Ripper
Status marks: ✅ ×6, 💪 ×5

Roadmap (diagram, updated)
Items added since the previous version: Optimize Node memory management; RBS
Status marks: ✅ ×7, 💪 ×6

yui-knk Lv. 31 HP 562 MP 68
ydah Lv. 30 HP 514 MP 67
junk0612 Lv. 31 HP 578 MP 64
hasumikin Lv. 29 HP 448 MP 68
S-H-GAMELINKS Lv. 28 HP 565 MP 60
Little-Rubyist Lv. 28 HP 442 MP 66
Little-Rubyist joins the party 🎉

We can use the latest Ruby ... syntax in *.rbinc source files if parse.y can be transformed into a Ruby parser
[Diagram] array.rb -> mk_builtin_loader.rb (ripper) + baseruby -> array.rbinc (C file)
[Diagram] parse.y -> Lrama + baseruby -> parse.rb; array.rb -> parse.rb + baseruby -> array.rbinc (C file)

Designe...
Parser generator gives correct feedback
Parser generator evolves independently from grammar
[Diagram] Nodes: Programming Language Designer, Grammar, Parser Generator, Parser
[Diagram] Edge labels: Design, Input, Generate, Feedback, Develop new features
[Diagram] Parser generator features: Cactuses data structures, Compact data structures, Panic Mode, CPCT+, LALR, IELR, PSLR

Roadmap (diagram, updated)
Same items as the previous version
Status marks: ✅ ×7, 💪 ×7

Lrama LALR (1) parser generator http...")
Parser Roadmap”, https://docs.google.com/presentation/d/ 1E4v9WPHBLjtvkN7QqulHPGJzKkwIweVfcaMsIQ984_Q Yuichiro Kaneko. “Ruby Parser։ൃ೔ࢽ (12) - LR parser generatorͷఏڙ͢Δจ๏ͷ݈શੑ”, September 2023. https://yui-knk.hatenablog.com/entry/2023/09/19/191135 Yuichiro Kaneko. “Ruby Parser։ൃ೔ࢽ (14) - LR parser׬શʹཧղͨ͠”, December 2023. https://yui-knk.hatenablog.com/entry/2023/12/06/082203 Lukas Diekmann and Laurence Tratt. “Don’t Panic! Better, Fewer, Syntax Errors for LR Parsers”, July 2020. https://arxiv.org/pdf/1804.07133.pdf Joel E. Denny. “PSLR(1): Pseudo-Scannerless Minimal LR(1) for the Deterministic Parsing of Composite Languages”, May 2010. https://tigerprints.clemson.edu/cgi/viewcontent.cgi? article=1519&context=all_dissertations