Making Parsers with yecc

Yecc is an Erlang version of yacc/bison: an LALR-1 parser generator that turns a grammar file into an Erlang parser module.

The grammar is written in BNF (Backus-Naur form) in a source file ending in .yrl; yrl stands for "yecc rule list". As an example, we can parse a simple XHTML document with yecc. We apply yecc to html.yrl to create a parser module called html_parser.erl, compile it, and then use html_parser to parse some XHTML. In the Erlang shell:

    yecc:yecc("html.yrl", "html_parser.erl").
    c(html_parser).
    f(B), {_, B, _} = erl_scan:string("<html> <body> hello_world </body> </html>").
    html_parser:parse(B).

Here erl_scan:string/1 serves as the tokenizer, so the document text must consist of valid Erlang tokens. All tags in the XHTML code must have a matching open and close tag. (Of course, a more powerful way to parse an XML file in Erlang is to use xmerl.)
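To see what the parser receives, it can help to look at the token list that erl_scan:string/1 produces for a small fragment. The sketch below is an illustration, not part of the original example; the module name scan_demo and the function shape/1 are hypothetical. It reduces each token to its category and symbol with erl_scan:category/1 and erl_scan:symbol/1, so the result does not depend on how line annotations are represented.

```erlang
-module(scan_demo).
-export([shape/1]).

%% Tokenize a string with the Erlang scanner and return
%% {Category, Symbol} pairs. A yecc parser matches the terminals
%% of its grammar against the category of each token.
shape(String) ->
    {ok, Tokens, _EndLocation} = erl_scan:string(String),
    [{erl_scan:category(T), erl_scan:symbol(T)} || T <- Tokens].
```

For instance, scan_demo:shape("<b> hi </b>") returns [{'<','<'},{atom,b},{'>','>'},{atom,hi},{'<','<'},{'/','/'},{atom,b},{'>','>'}]. Note that Erlang reserved words such as end come back with their own category rather than atom, so the grammar on this page would not accept them as text.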

html.yrl source:

    Nonterminals tag elements element start_tag end_tag.
    Terminals 'atom' '<' '>' '/'.
    Rootsymbol tag.

    tag -> start_tag tag end_tag        : ['$1', '$2', '$3'].
    tag -> start_tag tag tag end_tag    : ['$1', '$2', '$3', '$4'].
    tag -> start_tag elements end_tag   : ['$1', {'contents','$2'}, '$3'].
    tag -> start_tag end_tag            : ['$1','$2'].
    start_tag -> '<' 'atom' '>'         : {'open','$2'}.
    end_tag -> '<' '/' 'atom' '>'       : {'close','$3'}.
    elements -> element                 : ['$1'].
    elements -> element elements        : ['$1', '$2'].
    element -> atom                     : '$1'.

    % yecc:yecc("html.yrl","html_parser.erl").
    % c(html_parser).
    % f(B), {_,B,_} =
    %     erl_scan:string(
    %         "<html> <body> hello_world </body> </html>").
    % html_parser:parse(B).

It can be a pain to rebuild and rerun the parser each time we edit the .yrl source file. To speed things up, we can write a small test program that builds the parser and runs it on a sample document for us.

    -module(html_test).
    -compile(export_all).

    start() ->
        yecc:yecc("html.yrl", "html_parser.erl"),
        % compile and load the generated parser
        {ok, html_parser} = c:c(html_parser),
        {_, List_of_symbols, _} = erl_scan:string(
            "<html> <head> greeting </head> "
            "<body> hello there world what is up </body> </html>"),
        {ok, L} = html_parser:parse(List_of_symbols),
        register(do_event, spawn(html_test, event_loop, [])),
        Events = lists:flatten(L),
        send_events(Events),
        Events.

    % send every parse event to the event loop, then tell it to stop
    send_events([]) ->
        do_event ! {exit};
    send_events([H|T]) ->
        do_event ! H,
        %io:format(" ~w ~n",[H]),
        send_events(T).

    event_loop() ->
        receive
            {open, {atom, _Line_Number, html}} ->
                io:format("~n start scan ~n", []);
            {contents, List} ->
                Contents = get_contents(List, []),
                io:format("~n contents: ~w ~n", [Contents]);
            {exit} ->
                exit(normal);
            _Other ->
                % ignore the remaining open and close events
                ok
        end,
        event_loop().

    % the grammar nests elements as [E1, [E2, [E3]]], so the tail
    % is either [] or a one-element list holding the rest
    get_contents([], Items) ->
        Items;
    get_contents([H|T], Items) ->
        NT = case T of
                 [] -> [];
                 [Rest | _] -> Rest
             end,
        {atom, _N, Item} = H,
        NItems = Items ++ [Item],
        % io:format(" ~w ",[Item]),
        get_contents(NT, NItems).

    % 6> c(html_test).
    % {ok,html_test}
    % 7> html_test:start().
    % [greeting]
    % [hello,there,world,what,is,up]
    % and events.
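The nested list built by the rule elements -> element elements can also be unpacked with lists:flatten/1 and a list comprehension instead of a hand-written recursion. The following is a minimal sketch of that alternative, not the original author's code; the module name contents_demo and the function names/1 are hypothetical, and the token shape {atom, Line, Name} is assumed to be what erl_scan produces with its default options.

```erlang
-module(contents_demo).
-export([names/1]).

%% Nested is the value carried by a {contents, Nested} event, e.g.
%% [{atom,1,hello}, [{atom,1,there}, [{atom,1,world}]]].
%% Flatten the nesting, then keep the name of every atom token.
names(Nested) ->
    [Name || {atom, _Line, Name} <- lists:flatten(Nested)].
```

For example, contents_demo:names([{atom,1,hello},[{atom,1,there},[{atom,1,world}]]]) returns [hello,there,world]. Tuples are not lists, so lists:flatten/1 leaves each token intact and only removes the list nesting.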