Graph with python-igraph with attributes on nodes and edges

Hi everyone! I’m trying to create a graph using the python-igraph library. Each node would have attributes; for example, in a graph of books, each node would be a book, with the following attributes:

titulo: nome_do_titulo
sinopse: descricao_do_livro
autor: nome_do_autor
prefacio: descricao_do_prefacio

Each of these node attributes would have an edge linking it to the same attribute on another node, carrying the weight of the similarity between the two values. So each node would have 4 edges to each other node. I’m using Python 3 and igraph.

The way I’m implementing it is as follows:

    from igraph import Graph

    g = Graph(n=3)  # one vertex per book; the attribute lists need existing vertices
    g.vs["titulo"] = ["titulo1", "titulo2", "titulo3"]
    g.vs["sinopse"] = ["sinopse1", "sinopse2", "sinopse3"]
    g.vs["autor"] = ["autor1", "autor2", "autor3"]
    g.vs["prefacio"] = ["prefacio1", "prefacio2", "prefacio3"]

Now, how do I add the links between them? I know that g.es is used for this, but the igraph site is a bit vague about it.

1 answer

I didn’t know igraph; apparently it is a library for working with graphs in Python.

So, taking a look at the documentation and experimenting at the interactive Python prompt (this is the secret to figuring out how to do things), I got the following:

The library has no first-class support for edges tied to specific node attributes (or perhaps what it offers is what it considers first-class support, I don’t know :-) ).

But since edges can have arbitrary attributes, you can put a "tipo" attribute on each edge; that way you will know which node attribute it refers to. And, being a complete graph library, it fully supports more than one edge connecting the same pair of nodes, so you can have 2 edges between two nodes that you want to link both by title and by author.

As far as I could tell, an edge is created by passing the numeric indices of the nodes it connects. I suggest you put a "tipo" attribute on it (or "type": it is always better to keep your whole program, variable names included, in English, since you never know when the project will grow to have international collaborators. But more important than being in English is keeping all variables and functions in the same language: if you started in Portuguese, continue with everything in Portuguese). So, to connect nodes 0 and 1 by the title attribute, call:

    g.add_edge(source=0, target=1, tipo="titulo")

To find the indices of the nodes you want to connect, if you want an exact match (==) on an attribute, you can use "find" on the "g.vs" object; for other searches, you can filter with a plain Python "if", and then use itertools.combinations to get all possible pairs of nodes.

Example: let’s assume that you want to connect all nodes whose titles contain the word "Antarctica"; you can use a function like this:

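    import itertools

    def conecta(grafo, palavra, atributo):
        # Vertex attributes are read with no[atributo]; a missing value is None
        palavra = palavra.lower()
        nos_relevantes = [no.index for no in grafo.vs if palavra in (no[atributo] or "").lower()]
        # One edge per unordered pair of matching nodes
        for origem, destino in itertools.combinations(nos_relevantes, 2):
            grafo.add_edge(source=origem, target=destino, tipo=atributo)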

I think that answers your question so far. (Try these expressions at the interactive prompt, understand how itertools.combinations works, for example, etc.)
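As a quick experiment of that kind, the sketch below compares the two itertools helpers on a made-up list of node indices. It uses only the standard library; the list `nos` is hypothetical.

```python
import itertools

nos = [0, 3, 7]  # hypothetical indices of the nodes that matched a search

# combinations: each unordered pair exactly once, never a node with itself
pares = list(itertools.combinations(nos, 2))
print(pares)

# product with repeat=2: every ordered pair, including a node paired
# with itself and both directions of each pair
todos = list(itertools.product(nos, repeat=2))
print(len(todos))
```

For an undirected graph, combinations gives one edge per pair; product would also produce self-loops and duplicate edges.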

Now, I haven’t seen enough of igraph to know whether you will manage to do what you want with edges distinguished only by this "tipo" attribute. One thing you can certainly do, when you need to work with specific edges, is create a copy of the graph and keep in the new graph only the edges that are relevant:

    from copy import deepcopy

    g_titulo = deepcopy(g)
    # Drop every edge that is not a "titulo" edge
    g_titulo.delete_edges([e.index for e in g_titulo.es if e["tipo"] != "titulo"])

By the way: this igraph API for assigning attributes to nodes from a list is kind of crazy. I don’t know if it’s the way you’d prefer to work; I’d be more comfortable creating each node in one go, passing all of its attributes, which is how we usually work in Python. igraph supports this. That is, instead of creating the vertices as you are doing, you can do:

    import igraph
    g = igraph.Graph()
    dados = [
        {"titulo": "titulo1", "sinopse": "sinopse1", "autor": "autor1"},
        {"titulo": "titulo2", "sinopse": "sinopse2", "autor": "autor2"},
    ]
    for dado in dados:
        g.add_vertex(**dado)

That way it’s easy to put your initial data into a spreadsheet, save it as CSV, and read it with Python’s csv.DictReader to create a node for each of your books, for example. You also don’t have to worry about keeping, alongside the graph, lists with each attribute in the same order as the vertices.
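A minimal sketch of the CSV step, using only the standard library. The CSV content is made up, and io.StringIO stands in for a real file; with a file on disk (say, a hypothetical livros.csv) you would pass open("livros.csv", newline="") to DictReader instead.

```python
import csv
import io

# Stand-in for a CSV exported from a spreadsheet (invented data)
conteudo = """titulo,sinopse,autor
titulo1,sinopse1,autor1
titulo2,sinopse2,autor2
"""

# DictReader uses the header row as keys and yields one mapping per book
dados = list(csv.DictReader(io.StringIO(conteudo)))

for dado in dados:
    # Each row has the shape expected by g.add_vertex(**dado)
    print(dict(dado))
```

Each element of dados can then be passed to g.add_vertex(**dado) exactly as in the loop above.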

  • Each book can have different attributes, right? For instance, some may be missing the synopsis?

  • They can, but igraph creates each attribute for all books; it is "blank" where it has not been explicitly set (I think it is read as "None"). The fact is that if you have a graph with 100 books and create an attribute for a single one of them, the attribute will also appear on the other 99.

  • Do you have to use ** to pass the parameters? Can it be done without **?

  • The igraph API expects each attribute to come as a named parameter in the call; "**" is Python’s way of turning a dictionary into a series of named parameters at runtime. If you are using fixed keys, of course you can do without **, that is, g.add_vertex(autor=dado["autor"], titulo=dado["titulo"], ...) instead of g.add_vertex(**dado). On the other hand, if "dado" always has the same keys, the only reason not to use ** is readability.
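To see the equivalence described in the last comment without needing igraph at all, here is a small sketch. The function add_vertex below is a hypothetical stand-in that only mimics the keyword-argument style of g.add_vertex; it is not the library's implementation.

```python
def add_vertex(name=None, **kwds):
    """Hypothetical stand-in: just returns the keyword arguments it received."""
    return kwds

dado = {"titulo": "titulo1", "autor": "autor1"}

# Dictionary unpacked into named parameters at call time
com_estrelas = add_vertex(**dado)

# The same call written out with fixed keys
explicito = add_vertex(titulo=dado["titulo"], autor=dado["autor"])

print(com_estrelas == explicito)
```

Both calls hand the function the same named parameters; the difference is only in how the call is written.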
