I wrote a simple grammar to interpret a file whose format is much like JSON. However, when I try to parse the file I get a System.OutOfMemoryException. The cause appears to be the size of the file: it is 108 MB and has 4,682,073 lines.
Smaller files parse normally. For this file, however, the exception is thrown when the memory used by the process reaches almost 2 GB, and the program stops. The exception comes from the parser code generated by the ANTLR extension for Visual Studio.
How do I run the parser on really big files with ANTLR?
More information
The machine I am running the parser on has 8 GB of memory and a 2.8 GHz processor (Intel Core 2 Duo).
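Since the failure shows up near 2 GB on a machine with 8 GB, it may be worth checking whether the process runs as 32-bit: by default, a 32-bit Windows process has only about 2 GB of user address space. A minimal sketch for reporting this, where the class and method names are hypothetical and not part of the actual project:

```csharp
using System;

static class ProcessBitness
{
    // Hypothetical helper: describes the address-space mode of the current process.
    public static string Describe()
    {
        return Environment.Is64BitProcess ? "64-bit process" : "32-bit process";
    }

    static void Main()
    {
        Console.WriteLine(Describe());
        Console.WriteLine("64-bit OS: " + Environment.Is64BitOperatingSystem);
        Console.WriteLine("Pointer size in bytes: " + IntPtr.Size);
        // Approximate size of the managed heap at this moment.
        Console.WriteLine("Managed heap bytes: " + GC.GetTotalMemory(false));
    }
}
```

If this reports a 32-bit process, the project's platform target is one setting to look at. That is only a guess about the cause, not a confirmed diagnosis.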
Example of the problem
Example file for reading
(
:field ("ObjectName"
:field (
:field ("{6BF621F9-A0E2-49BB-A86B-3DE4750954F4}")
:field (Value)
:field (Value)
:field (
:Time ("Sun Jan 26 10:08:33 2014")
:last_modified_utc (1390730913)
:By ("Mensagem qualquer")
:From (localhost)
)
:field ("Applications/application_fw1")
:field (false)
:field (false)
)
:field ()
:field ()
:field ()
:field (0)
:field (true)
:field (true)
)
.
.
.
Thousands of other fields.
.
.
.
)
The grammar
grammar Objects;
/*
* Parser Rules
*/
compileUnit
: obj
;
obj
: OPEN ID? (field)* CLOSE
;
field
: ':'(ID)? obj
;
/*
* Lexer Rules
*/
OPEN
: '('
;
CLOSE
: ')'
;
ID
: (ALPHA | ALPHA_IN_STRING)
;
fragment
INT_ID
: ('0'..'9')
;
fragment
ALPHA_EACH
: 'A'..'Z' | 'a'..'z' | '_' | INT_ID | '-' | '.' | '@'
;
fragment
ALPHA
: (ALPHA_EACH)+
;
fragment
ALPHA_IN_STRING
: ('"' ( ~[\r\n] )+ '"')
;
WS
// : ' ' -> channel(HIDDEN)
: [ \t\r\n]+ -> skip // skip spaces, tabs, newlines
;
Execution of parser
// text holds the contents of the 108 MB file to be read.
var input = new Antlr4.Runtime.AntlrInputStream(text);
var lexer = new ObjectsLexer(input);
var tokens = new Antlr4.Runtime.CommonTokenStream(lexer);
var parser = new ObjectsParser(tokens);
// Context for the compileUnit rule
// ERROR: the problem occurs here, when the tree for compileUnit starts being built.
// Execution never reaches the visitor; the exception is thrown inside compileUnit()
var ctx = parser.compileUnit();
// Run the visitor
new ObjectsVisitor().Visit(ctx);
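One variation that might reduce memory use is to stop the parser from building a parse tree at all. The sketch below is an untested assumption, not a confirmed fix: it relies on the `BuildParseTree` property of the ANTLR 4 C# runtime and on the `ObjectsLexer` and `ObjectsParser` classes generated from the grammar above. The `ParseWithoutTree` class and the `path` parameter are hypothetical.

```csharp
using System.IO;
using Antlr4.Runtime;

static class ParseWithoutTree
{
    // Hypothetical helper; ObjectsLexer and ObjectsParser are the classes
    // generated from the Objects grammar.
    public static void Run(string path)
    {
        var input = new AntlrInputStream(File.ReadAllText(path));
        var lexer = new ObjectsLexer(input);
        var tokens = new CommonTokenStream(lexer);
        var parser = new ObjectsParser(tokens);

        // Ask the parser not to link rule contexts into a full parse tree.
        parser.BuildParseTree = false;

        // Still checks the input against the grammar and reports syntax errors.
        parser.compileUnit();
    }
}
```

With no tree there is nothing for `ObjectsVisitor` to visit, so the processing would have to move elsewhere, for example into a parse listener or actions embedded in the grammar. The input text and the token stream are also still held in memory here, so this may not be enough on its own.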
Could you please post an example of the code you are using to parse the file?
– Leonel Sanches da Silva
@Gypsy omorrisonmendez I added an example.
– anmaia