How to perform semantic analysis using pure functional programming without side Effect?

Asked

Viewed 123 times

5

I wonder if there is any way to know what the current context is without using side effects as Agente of the elixir , letting semantic analysis be carried out along a pipeline of functions that walk through the AST.

Some things I’m more interested in:

  • Check whether a variable has been declared or not.

  • How to know which context of the code is being analyzed, for example knowing that the variable no longer exists after leaving a block in the language c:

int a;
{
    int b = 0;
}
// b não está mais válida para ser usada aqui

Here was a small compiler that I did but I have to instantiate all the variables first since I only have a global context.

  • I’m sorry but it’s not very clear, how to perform an analysis means an algorithm? as you quoted a compiler, has a code? need to see the logic used to better understand, need to give more details

  • I’m sorry to disagree but it’s clear, what I need to do is perform the semantic analysis, so that’s enough to understand the question, but I’ve put a limitation that is to perform this compilation step using only resources available in pure functional languages. This implies using only functions and not saving an external state as a global object to store the current context.

  • Also in the link provided follows my attempt to perform semantic analysis, but I "stole" using the Agent available in Elixir to save some processing state out of function.

  • 1

    That’s a great question for https://cs.stackexchange.com

  • Thanks for the tip, I’ll translate the question and send there too

  • 1

    Elixir and Erlang to me is Greek, seeing his project I could not figure out how he creates the context structures, but usually use maps (dictionaries) chained which is a stack of key value structures where a name is initially searched to the map further to the top and while it is not found it is then searched down the stack. The context toward the base of the stack is wider (overall) while the context toward the top of the stack is more restricted (local).

  • 1

    If you refer to AST you have two codes in the folder src/, one for the lexical analyser lexer.xrl and one for the parser that generates my AST parser.yrl. I use the tools of Erlang LEEX and YEEC and pass the output (tokens) of the lexer.xrl directly to the parser.yrl producing the AST noted in the file compiler.ex immediately before starting the translation with the function translate/2. This mechanics is similar to Flex/Lex + Yacc/Bison.

  • In the link code I only and only one context and store it within a Agent right at the beginning of the translation/semantic analysis, which is similar to an object (saved the differences of paradigms and the functionality itself of the thing) but I can insert and change states in it, in case I DO NOT want to use this Agent to store my nested contextual structure.

  • But with the agent yes, you could then push/pop on top of a stack of contexts that would be stored on Agent. So the problem is to generate this stack of contexts outside the Agent, and a bigger problem is going through/checking it since I don’t store the state outside the Agent.

Show 4 more comments

1 answer

1


Surely there is a way to do this without side effects.

The idea is to use a stateless data structure to represent context. The function that evaluates the AST takes the previous context and returns a new context according to the instructions in the AST.

On the question of creating and destroying new contexts, it is very simple, the program will not have 1 context only, but a list of contexts, then you add a new context when starting a function, and remove when finishing the function. (As I said above, in practice you will not actually add and remove, but rather create new lists)

I’ve never programmed in Elixir, but I imagine it provides stateless structures. All functional language available, otherwise it would be impossible to program in them.

Below I will leave an implementation that I made a Javascript. Note that I do not modify the context anywhere, that is, this code can be peacefully converted to a purely functional language.

I left some comments in the code that will be useful for your understanding.

const ast = {
    "type": "program",
    "statements": [

        // var a = 1
        {
            "type": "decl",
            "id": "a",
            "expr": {
                "type": "literal",
                "value": "1"
            }
        },

        // function f { ... }
        {
            "type": "fn",
            "id": "f",
            "statements": [

                // Variável `a` está visível pois é top-level.
                // var x = a
                {
                    "type": "decl",
                    "id": "x",
                    "expr": {
                        "type": "var",
                        "id": "a"
                    }
                },

                // Variável `x` está visível pois é do mesmo contexto.
                // var y = x
                {
                    "type": "decl",
                    "id": "y",
                    "expr": {
                        "type": "var",
                        "id": "x"
                    }
                },

            ],
        },

        // f()
        {
            "type": "fn-call",
            "id": "f",
        },

        // Variável `a` permanece visível
        // var b = a
        {
            "type": "decl",
            "id": "b",
            "expr": {
                "type": "var",
                "id": "a"
            }
        },

        // O nodo abaixo dá erro! A variável `x` não está mais acessível aqui!
        // // var c = x
        // {
        //     "type": "decl",
        //     "id": "c",
        //     "expr": {
        //         "type": "var",
        //         "id": "x"
        //     }
        // },
    ]
}

// Função helper pra deixar o código mais legível.
function last(xs) {
    return xs[xs.length - 1]
}

// Função helper pra deixar o código mais legível.
function allButLast(xs) {
    return xs.slice(0, -1)
}

// Procura em todos os contextos acumulados, ou seja, considera variáveis
// top-level.
// Apesar de usar laços de repetição, perceba que o código é stateless e pode
// ser facilmente convertido pra uma lógica recursiva.
// Perceba, ainda, que o uso do throw é muito feio, fiz assim apenas por
// simplicidade. Retorne um monada que represente a possibilidade de não
// encontrar o item, dessa forma isso fica explícito no sistema de tipos.
function findInContext(context, type, id) {
    for (const subCtx of context) {
        for (const item of subCtx) {
            if (item["type"] === type && item["id"] === id) {
                return item
            }
        }
    }

    throw new Error(`Identificador ${id} não encontrado`)
}

// Optei por separar a evaluação de expressões e de statements. Isso é gosto
// pessoal, no teu código faça como achar mais apropriado.
function evaluateExpr(expr, context) {
    switch (expr["type"]) {
        case "literal": {
            return expr["value"]
        }

        case "var": {
            const item = findInContext(context, "var", expr["id"])
            return item ? item["value"] : null
        }
    }
}

// A função `evaluate` nunca modifica o contexto, apenas gera novos contextos
// de acordo com o fluxo do programa. Totalmente stateless.
function evaluate(ast, context) {
    switch (ast["type"]) {
        case "program": {
            const newContext = [[]]
            const f = (ctx, stmt) => evaluate(stmt, ctx)
            return ast["statements"].reduce(f, newContext)
        }

        case "decl": {
            const expr = evaluateExpr(ast["expr"], context)
            const id = ast["id"]
            const item = { "type": "var", "id": id, "value": expr }
            return [...allButLast(context), [...last(context), item]]
        }

        case 'fn': {
            const id = ast["id"]
            const statements = ast["statements"]
            const item = { "type": "fn", "id": id, "statements": statements }
            return [...allButLast(context), [...last(context), item]]
        }

        case 'fn-call': {
            const func = findInContext(context, "fn", ast["id"])
            // Cria um novo contexto no qual as variáveis da função serão
            // declaradas.
            const newContext = [...context, []]
            const f = (ctx, stmt) => evaluate(stmt, ctx)
            func["statements"].reduce(f, newContext)
            // Aqui tá o pulo do gato. Ao encerrar a execução da função,
            // retorna o contexto antigo, ou seja, ignora todas as variáveis
            // que foram declaradas dentro da função.
            return context
        }
    }
}

evaluate(ast)

Browser other questions tagged

You are not signed in. Login or sign up in order to post.