Building hierarchical tree (as opposed to flat tree)

Nov 3, 2010 at 2:07 AM
Edited Nov 3, 2010 at 2:08 AM

From my understanding (which is very limited, as I am a newcomer to ANTLR and this project is my first play with it), this project produces a flat parse tree of a .cs file. I was hoping to get things organised in a hierarchical tree, whereby a namespace would be a node with children that represent classes, which in turn have child nodes that represent members (properties, methods, etc.).

Does this project have any capabilities to do such a thing (and if so, whereabouts)? If not, does anybody know of anything which can?

Otherwise, am I right in thinking that my alternative is basically (as per this question: http://stackoverflow.com/questions/1790528/understanding-trees-in-antlr ) to adapt the grammar (that is, edit cs.g) to construct a tree as per this documentation: http://www.antlr.org/wiki/display/ANTLR3/Tree+construction ? As a total newcomer to ANTLR and indeed parsing in general, I expect this would be rather nontrivial.

Coordinator
Nov 3, 2010 at 3:11 PM

There are only 2 things you need to do.  Add "output=AST;" to the grammar options at the top of cs.g.  Then build the UnitTest project with a "Conditional compilation symbol:"/DEFINE of AST (in VS project properties).  Then the call to compilation_unit() will return you an Abstract Syntax Tree, which is what you want.
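For reference, a minimal sketch of the first step, assuming cs.g already has an options block near the top. The comment stands in for whatever options are already there, which are not shown here:

```antlr
options
{
    // .... existing options in cs.g stay unchanged
    output=AST;   // ask ANTLR to generate tree-building code in the parser
}
```

The AST conditional compilation symbol is set separately, in the Visual Studio project properties, as described above.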

HTH

Nov 3, 2010 at 10:26 PM

Hi, FWIW I did read the other posts here and had tried that before I posted, but maybe I'm doing something wrong. When I try to parse, for example, the Args.cs included in the unit test project (shown here for reference):


using System;

namespace Util
{
    public static class Args
    {
        public static bool IsFlagSet(string flag)
        {
            return flag == Array.Find(Environment.GetCommandLineArgs(), s => s == flag);
        }
    }

}

I obtain a tree with 43 direct children, one for each token in the file.

So the tree is like

Root
    using
    System
    ;
    namespace
    Util
    {
    public
    static
    class
    Args
    {
    public
    static
    bool
    IsFlagSet
    (
    string
    flag
    )
    {
    return
    flag
    ==
    Array
    .
    Find
    (
    [etc]

This is the flat tree I referred to in my first post. I was hoping to obtain a tree that was hierarchically organised in some convenient manner, e.g.:

Root
    using System
    namespace Util
        public static class Args
            public static bool IsFlagSet
                // args here
                // impl here
        // other class here
    // other namespace here



Obviously I am new to parsing and I guess there are many ways such a tree could be organised and many new problems in representation I hadn't considered, but is this possible from this project? If not, does anybody know of anything out there which can?


Coordinator
Nov 3, 2010 at 11:38 PM

Run unittest with -n and you will see a tree like you are looking for.

Nov 4, 2010 at 3:30 AM
Edited Nov 4, 2010 at 9:16 PM

Howdy, thanks for the prompt response. I did also try this before the post, but as far as I can tell it just steps through the flat list and prints each token, which is in keeping with my original understanding.

I've now stepped through the code with a debugger, and it does just iterate through the CommonTreeNodeStream as if it were a flat list. At no time does it reach the code blocks related to the DOWN or UP segments (I'm not exactly sure what these do).

C:\lib\antlrcsharp100620\UnitTest\bin\Debug>Parse c:\lib\antlrcsharp100620\UnitTest\Args.cs -n
---------------
c:\lib\antlrcsharp100620\UnitTest\Args.cs
c:\lib\antlrcsharp100620\UnitTest\Args.cs
parser using rule -- compilation_unit:
Nodes
  using System ; namespace Util { public static class Args { public static bool IsFlagSet (
string flag ) { return flag == Array . Find ( Environment . GetCommandLineArgs ( ) , s => s
 == flag ) ; } } }
Parsed 1 of 1 files. (100%)

Edit: edited the output to remove the long single line that was breaking the forum.

Coordinator
Nov 4, 2010 at 6:23 PM

I'm sorry, I forgot that I took out all the tree rewrite syntax from cs.g to make things simpler.

Here's how to tell ANTLR to do that.

cs.g

options
{
....
}

tokens
{
    USING;
}
....
using_namespace_directive:
	'using'   namespace_name   ';' -> ^(USING namespace_name)
	;

That creates a USING root, with a namespace_name child.

Nov 4, 2010 at 8:54 PM

Hey anbrad, not a problem, thank you for your patience in dealing with these beginner questions... And now another one :)

I added the changes you suggested and got this output (note that I've limited the column width to 80 chars to prevent it breaking the forum; a pastebin of the output is here: http://pastebin.com/guGCNGbj):

C:\lib\antlrcsharp100620\UnitTest\bin\Debug>Parse ..\..\Args.cs -n
---------------
..\..\Args.cs
C:\lib\antlrcsharp100620\UnitTest\bin\Debug\..\..\Args.cs
parser using rule -- compilation_unit:
Nodes

   USING System
 namespace Util { public static class Args { public static bool IsFlagSet ( stri
ng flag ) { return flag == Array . Find ( Environment . GetCommandLineArgs ( ) ,
 s => s == flag ) ; } } }
Parsed 1 of 1 files. (100%)

So this does create a USING root... but everything else continues to be flat.

I guess I just wanted to check: you were only providing an example of how to write a rewrite rule, and if I want tree output like I described earlier, I need to start learning about tree construction and rewrite rules and make further modifications to cs.g myself?

Coordinator
Nov 5, 2010 at 3:14 PM

That's right, I was just giving an example of creating one rewrite rule.  You will have to write one for every rule that you are interested in to see a complete tree.
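
As a rough sketch of what further rewrite rules could look like, the fragment below adds imaginary NAMESPACE and CLASS tokens next to USING. The rule and token names namespace_declaration, class_declaration, qualified_identifier, namespace_member, modifier, class_member and IDENTIFIER are assumptions made for illustration only; they will not match the real names in cs.g, whose existing rules should be edited in place rather than duplicated.

```antlr
tokens
{
    USING;
    NAMESPACE;   // hypothetical imaginary token used as the namespace root
    CLASS;       // hypothetical imaginary token used as the class root
}

// Hypothetical rule: the real namespace rule in cs.g is named and shaped differently.
// The keyword and braces are dropped; the members become children of NAMESPACE.
namespace_declaration:
	'namespace'   qualified_identifier   '{'   namespace_member*   '}'
		-> ^(NAMESPACE qualified_identifier namespace_member*)
	;

// Hypothetical rule: the class name comes first, then its modifiers and members.
class_declaration:
	modifier*   'class'   IDENTIFIER   '{'   class_member*   '}'
		-> ^(CLASS IDENTIFIER modifier* class_member*)
	;
```

Rewrite rules like these only take effect with output=AST set in the grammar options. Any rule without a rewrite keeps adding its tokens as flat siblings, which is why only the using directive was nested in the output above.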