How to sort and sequence data from a file

I developed a program that stores a list of IDs in pairs, like this:

[image: enter image description here]

For my purpose, however, the IDs should be renumbered sequentially, so that in the first pair "889926212541448192" becomes 1 and "889919950248448000" becomes 2. That is, the resulting file should look something like this:

[image: Example]

Here the first ID connects to 2, 3 and 6, and ID 4 connects only to 5, forming a network.
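
To make the target format concrete, here is a minimal sketch. It assumes the renumbered pairs are already available as tuples; the `edges` list and the `group_by_source` helper are hypothetical names, and the data is only an illustration built from the description above, not the real file. It groups each source ID with the IDs it connects to.

```python
from collections import defaultdict

# Hypothetical renumbered pairs matching the description:
# ID 1 connects to 2, 3 and 6, and ID 4 connects only to 5.
edges = [(1, 2), (1, 3), (1, 6), (4, 5)]


def group_by_source(pairs):
    """Map each source ID to the list of target IDs it connects to."""
    network = defaultdict(list)
    for source, target in pairs:
        network[source].append(target)
    return dict(network)


network = group_by_source(edges)
print(network)  # {1: [2, 3, 6], 4: [5]}
```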

I have no experience in this area, and I can't find a way to read the data this way.

I tried writing a few programs, but they only read the file row by row, not column by column, ID by ID. The data is saved by the following program:

import json

arq = open('ids.csv', 'w')
arq.write('Source' + ',' + 'Target')
arq.write("\n")

lista_rede = []  # list to store all ids

with open('dados_twitter.json', 'r') as f:
    for line in f:
        tweet = json.loads(line)  # parse the line into a Python dictionary
        lista = list(tweet.keys())  # list of the tweet's keys

        try:
            # retweet: pair the tweet id with the id of the retweeted tweet
            if 'retweeted_status' in lista:
                id_rt = json.dumps(tweet['retweeted_status']['id_str'])
                id_status = json.dumps(tweet['id_str'])

                lista_rede.append(tweet['id_str'])
                lista_rede.append(tweet['retweeted_status']['id_str'])

                arq.write(id_status + ',' + id_rt)
                arq.write("\n")

            # quote: pair the tweet id with the id of the quoted tweet
            if 'quoted_status' in lista:
                id_rt = json.dumps(tweet['quoted_status']['id_str'])
                id_status = json.dumps(tweet['id_str'])

                lista_rede.append(tweet['id_str'])
                lista_rede.append(tweet['quoted_status']['id_str'])

                arq.write(id_status + ',' + id_rt)
                arq.write("\n")
        except (KeyError, TypeError):
            # skip tweets that lack the expected fields
            continue

arq.close()

As a result, I have a file with the IDs arranged in pairs of interactions.

How can I renumber this data when reading it, or even when writing it? In Python or in another language?

  • Please do not post code as images; the site supports code formatting. Edit the question and insert the code as text.

1 answer

The question is somewhat confusing, and pasting the code as text instead of a screenshot would also help. Anyway, if I understood correctly, you want each ID to be replaced by a corresponding number, incrementing by one for each new ID. I wrote a function that takes a list as its argument and returns it in that form. Here it is:

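def muda_ids(lista):
    # collect the distinct ids in order of first appearance
    antigos_ids = []
    for linha in lista:
        ids = linha.split(',')
        for id_antigo in ids:
            if id_antigo not in antigos_ids:
                antigos_ids.append(id_antigo)
    # replace each old id with its position in the list (starting at 0)
    for cont in range(len(antigos_ids)):
        lista = [w.replace(antigos_ids[cont], str(cont)) for w in lista]
    return lista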

For example, by passing as argument the following list:

['100,110', '100,200', '300,154', '400,156', '100,110']

The function returns:

['0,1', '0,2', '3,4', '5,6', '0,1']
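
Because `str.replace` matches substrings, it could alter an ID that happens to contain another ID or an already assigned number. An alternative is to keep a dictionary from each original ID to its number and rebuild every pair from it. The following is only a minimal sketch under assumptions: the name `renumber_pairs` and the `start` parameter are hypothetical, and each item is assumed to be a 'source,target' string.

```python
def renumber_pairs(lines, start=0):
    """Replace each distinct ID with a sequential number, in order of first appearance."""
    mapping = {}  # original ID -> assigned number
    result = []
    for line in lines:
        new_ids = []
        for old_id in line.strip().split(','):
            if old_id not in mapping:
                # the next free number is the count of IDs seen so far, offset by start
                mapping[old_id] = start + len(mapping)
            new_ids.append(str(mapping[old_id]))
        result.append(','.join(new_ids))
    return result, mapping


pairs = ['100,110', '100,200', '300,154', '400,156', '100,110']
new_pairs, mapping = renumber_pairs(pairs)
print(new_pairs)  # ['0,1', '0,2', '3,4', '5,6', '0,1']
```

Passing `start=1` would give the 1-based numbering described in the question, and the returned `mapping` makes it possible to look up which original ID each number stands for.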
  • Thank you very much. I rewrote the way I create the file so that it builds a list like the one in your example, and then used this replace function. As I am not from this area, I did not know how powerful it is, and it has been really important. I will now write a file with this new list. Thank you.

  • You are welcome, I am happy to help :) If you are satisfied with the answer, mark it as accepted.
