Surely there is a way to do this without side effects.
The idea is to use a stateless data structure to represent context. The function that evaluates the AST takes the previous context and returns a new context according to the instructions in the AST.
On the question of creating and destroying new contexts, it is very simple, the program will not have 1 context only, but a list of contexts, then you add a new context when starting a function, and remove when finishing the function. (As I said above, in practice you will not actually add and remove, but rather create new lists)
I’ve never programmed in Elixir, but I imagine it provides stateless structures. All functional language available, otherwise it would be impossible to program in them.
Below I will leave an implementation that I made a Javascript. Note that I do not modify the context anywhere, that is, this code can be peacefully converted to a purely functional language.
I left some comments in the code that will be useful for your understanding.
const ast = {
"type": "program",
"statements": [
// var a = 1
{
"type": "decl",
"id": "a",
"expr": {
"type": "literal",
"value": "1"
}
},
// function f { ... }
{
"type": "fn",
"id": "f",
"statements": [
// Variável `a` está visível pois é top-level.
// var x = a
{
"type": "decl",
"id": "x",
"expr": {
"type": "var",
"id": "a"
}
},
// Variável `x` está visível pois é do mesmo contexto.
// var y = x
{
"type": "decl",
"id": "y",
"expr": {
"type": "var",
"id": "x"
}
},
],
},
// f()
{
"type": "fn-call",
"id": "f",
},
// Variável `a` permanece visível
// var b = a
{
"type": "decl",
"id": "b",
"expr": {
"type": "var",
"id": "a"
}
},
// O nodo abaixo dá erro! A variável `x` não está mais acessível aqui!
// // var c = x
// {
// "type": "decl",
// "id": "c",
// "expr": {
// "type": "var",
// "id": "x"
// }
// },
]
}
// Função helper pra deixar o código mais legível.
function last(xs) {
return xs[xs.length - 1]
}
// Função helper pra deixar o código mais legível.
function allButLast(xs) {
return xs.slice(0, -1)
}
// Procura em todos os contextos acumulados, ou seja, considera variáveis
// top-level.
// Apesar de usar laços de repetição, perceba que o código é stateless e pode
// ser facilmente convertido pra uma lógica recursiva.
// Perceba, ainda, que o uso do throw é muito feio, fiz assim apenas por
// simplicidade. Retorne um monada que represente a possibilidade de não
// encontrar o item, dessa forma isso fica explícito no sistema de tipos.
function findInContext(context, type, id) {
for (const subCtx of context) {
for (const item of subCtx) {
if (item["type"] === type && item["id"] === id) {
return item
}
}
}
throw new Error(`Identificador ${id} não encontrado`)
}
// Optei por separar a evaluação de expressões e de statements. Isso é gosto
// pessoal, no teu código faça como achar mais apropriado.
function evaluateExpr(expr, context) {
switch (expr["type"]) {
case "literal": {
return expr["value"]
}
case "var": {
const item = findInContext(context, "var", expr["id"])
return item ? item["value"] : null
}
}
}
// A função `evaluate` nunca modifica o contexto, apenas gera novos contextos
// de acordo com o fluxo do programa. Totalmente stateless.
function evaluate(ast, context) {
switch (ast["type"]) {
case "program": {
const newContext = [[]]
const f = (ctx, stmt) => evaluate(stmt, ctx)
return ast["statements"].reduce(f, newContext)
}
case "decl": {
const expr = evaluateExpr(ast["expr"], context)
const id = ast["id"]
const item = { "type": "var", "id": id, "value": expr }
return [...allButLast(context), [...last(context), item]]
}
case 'fn': {
const id = ast["id"]
const statements = ast["statements"]
const item = { "type": "fn", "id": id, "statements": statements }
return [...allButLast(context), [...last(context), item]]
}
case 'fn-call': {
const func = findInContext(context, "fn", ast["id"])
// Cria um novo contexto no qual as variáveis da função serão
// declaradas.
const newContext = [...context, []]
const f = (ctx, stmt) => evaluate(stmt, ctx)
func["statements"].reduce(f, newContext)
// Aqui tá o pulo do gato. Ao encerrar a execução da função,
// retorna o contexto antigo, ou seja, ignora todas as variáveis
// que foram declaradas dentro da função.
return context
}
}
}
evaluate(ast)
I’m sorry but it’s not very clear, how to perform an analysis means an algorithm? as you quoted a compiler, has a code? need to see the logic used to better understand, need to give more details
– Ricardo Pontual
I’m sorry to disagree but it’s clear, what I need to do is perform the semantic analysis, so that’s enough to understand the question, but I’ve put a limitation that is to perform this compilation step using only resources available in pure functional languages. This implies using only functions and not saving an external state as a global object to store the current context.
– pic
Also in the link provided follows my attempt to perform semantic analysis, but I "stole" using the
Agent
available inElixir
to save some processing state out of function.– pic
That’s a great question for https://cs.stackexchange.com
– Piovezan
Thanks for the tip, I’ll translate the question and send there too
– pic
Elixir and Erlang to me is Greek, seeing his project I could not figure out how he creates the context structures, but usually use maps (dictionaries) chained which is a stack of key value structures where a name is initially searched to the map further to the top and while it is not found it is then searched down the stack. The context toward the base of the stack is wider (overall) while the context toward the top of the stack is more restricted (local).
– Augusto Vasques
If you refer to AST you have two codes in the folder
src/
, one for the lexical analyserlexer.xrl
and one for the parser that generates my ASTparser.yrl
. I use the tools of Erlang LEEX and YEEC and pass the output (tokens) of thelexer.xrl
directly to theparser.yrl
producing the AST noted in the filecompiler.ex
immediately before starting the translation with the functiontranslate/2
. This mechanics is similar to Flex/Lex + Yacc/Bison.– pic
In the link code I only and only one context and store it within a
Agent
right at the beginning of the translation/semantic analysis, which is similar to an object (saved the differences of paradigms and the functionality itself of the thing) but I can insert and change states in it, in case I DO NOT want to use thisAgent
to store my nested contextual structure.– pic
But with the agent yes, you could then push/pop on top of a stack of contexts that would be stored on
Agent
. So the problem is to generate this stack of contexts outside theAgent
, and a bigger problem is going through/checking it since I don’t store the state outside theAgent
.– pic