Take an element from one list and search for it in another list

2

I am having a problem with lists. I have some files to read, and I load each one into a list (3 files, 3 lists).

Basically, one of the files has the data of the IBGE technicians. I need to take each technician's registration number and look for that string in another list to validate that the technician really exists. I need to do the same with another list that holds the cities where the registrations were made.

I looked into searching a list with index, but from what I understand it finds the first matching element and returns its position.

I also looked at if element in list. However, what I want is to take the technician's registration (one file) and the city (another file) and check in the IBGE survey (the third file) whether that technician and that city exist. That way I could validate the records and then build statistics from the answers of the people who responded to the survey.

So far I have written some simple code with what I have learned. I can open the files, split the lines to make them easier to read, and print the list elements. The way I did it, each line of the file is a single element, but it would be preferable to split the lines into smaller parts.

Example of how it is now:

 'T010;4404;08430-026;6;64;6;4;2;-;4;6;2;-;2;1;1;7;1992;4;1;1\n', 'T011;866;04854-280;1;62;6;1;2;-;10;5;3;-;6;1;2;4;1970;5;2;3\n', 
The best form would be something like 'T010;','4404;','08430-026;','6;','64;','6;','4;','2;','-;','4;','6;','2;'...\n',

In this case I would take '4404' and look it up in a regions file, which would return the city if 4404 exists there. I would then take that city together with the technician's registration, 'T010', and search for both in the IBGE survey file.

Look at my code so far:

 `f = open('exemplopesquisa.txt', 'r')
matrizex=f.readlines()
print(matrizex)
for line in matrizex: 
    # Split the string on ;
    Type = line.split(";") 
    a = Type[0] 
    b = Type[1]
    c = Type[2]
    d = Type[3]
    e = Type[4]
    f = Type[5]
    g = Type[6]
    h = Type[7]
    i = Type[8]
    j = Type[9]
    k = Type[10]
    l = Type[11]
    m = Type[12]
    n = Type[13]
    o = Type[14]
    p = Type[15]
    q = Type[16]
    r = Type[17]
    s = Type[18]
    t = Type[19]
    v= Type[20]
    print(a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,v) 
print(matrizex[0:801])
print(len(matrizex))
print(len(line))
if 'T001' in line[0:60]:
    print("TUDO OK")
for linha in matrizex:
    if 'T001' in linha:
        print("ok")`

The files are in this folder: Link to the txt files

  • Could you post a clear example of the contents of each file? Preferably with more than one line.

1 answer

2

There are several issues to work on in the proposed code, but congratulations on how the question was written. It shows effort.

Use the with open statement when opening files in Python. It defines the scope of the file more clearly and takes care of closing the file as soon as the block ends.

Since you are working with an IBGE txt file, it will probably contain accented characters encoded in UTF-8. It is always good to specify the encoding when opening text files. To do so, use the encoding argument.

matrizex = []
with open('arquivo.txt', 'r', encoding='utf-8') as f:
    for line in f:
        line = line.strip()  # remove the trailing newline and any extra spaces
        matrizex.append(line.split(";"))

In matrizex you will have a list of lists: one inner list per line of your file, each holding the fields of that line.

You can repeat this process for each of the files and save in different lists.
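As a rough sketch of that repetition, the reading step could be wrapped in a small helper and called once per file. The function name load_rows and the file names in the usage comments are assumptions made for illustration, not names taken from your files.

```python
def load_rows(path, sep=";", encoding="utf-8"):
    """Read a delimited text file into a list of lists (one inner list per line)."""
    rows = []
    with open(path, "r", encoding=encoding) as f:
        for line in f:
            line = line.strip()  # drop the trailing newline and stray spaces
            if line:  # skip blank lines
                rows.append(line.split(sep))
    return rows


# Hypothetical usage, one call per file (the file names are placeholders):
# pesquisa = load_rows("exemplopesquisa.txt")
# tecnicos = load_rows("tecnicos.txt")
# regioes = load_rows("regioes.txt")
```

Keeping the separator and encoding as parameters means the same helper still works if one of the files turns out to use a different delimiter or encoding.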

Next, identify the position of the fields you want to compare. From what I understand, the ID is always the first value on each line of the file you showed.

Let's say you want to check whether this ID exists in another file. To do so, let's assume that you repeated the process above and saved the other file in the same way, but in the variable matrizey.

So, to check whether the ID (which is always at index 0 of each line) also exists in the other file (let's assume it is at index 0 there as well), you would do the following:

for line in matrizex:
    ID = line[0]
    for line2 in matrizey:
        if ID == line2[0]:
            print("ID:{} exists in both files!".format(ID)) # print the result
            # do something

You can repeat this process for the other file as well, changing the index to compare whichever fields you need.
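The nested loop compares every line of one file against every line of the other. One possible alternative, if the files are large, is to collect the IDs of the second file in a set once and then test membership with in. The sketch below uses small inline lists in place of the real files; the rows shown for matrizey are made up for illustration.

```python
# Sample data standing in for the lists built from the files (values are illustrative).
matrizex = [
    ["T010", "4404", "08430-026"],
    ["T011", "866", "04854-280"],
]
matrizey = [
    ["T010", "Some technician"],
    ["T099", "Another technician"],
]

# Build the set of known IDs once; each "in" test on a set is then a fast lookup.
known_ids = {row[0] for row in matrizey}

found = [row[0] for row in matrizex if row[0] in known_ids]
missing = [row[0] for row in matrizex if row[0] not in known_ids]

print("Found in both files:", found)
print("Missing from the second file:", missing)
```

The same pattern should work for the city check: build a set from the column of the regions file that holds the city code and test the corresponding field of each survey line (index 1 in your example, if '4404' is indeed the city code).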

One more suggestion: for analyzing tabular data in Python, consider the Pandas library. It makes this kind of operation much easier. Here is the link so you can take a look: https://pandas.pydata.org/
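To give an idea of what that could look like, here is a minimal sketch that reads two tables and flags which technician IDs exist in both. The inline text stands in for your files, and the contents, the variable names, and the column name tecnico_existe are all assumptions; it also assumes the files have no header row and that the ID is in the first column.

```python
import io

import pandas as pd

# Inline text standing in for two of the files (contents are illustrative).
pesquisa_txt = "T010;4404;08430-026\nT011;866;04854-280\n"
tecnicos_txt = "T010\nT099\n"

# header=None: assume there is no header row; dtype=str keeps codes such as '866' as text.
pesquisa = pd.read_csv(io.StringIO(pesquisa_txt), sep=";", header=None, dtype=str)
tecnicos = pd.read_csv(io.StringIO(tecnicos_txt), sep=";", header=None, dtype=str)

# Column 0 is assumed to hold the technician ID in both tables.
pesquisa["tecnico_existe"] = pesquisa[0].isin(tecnicos[0])
print(pesquisa)
```

With real files you would pass the file path to pd.read_csv instead of the io.StringIO object, along with encoding='utf-8'.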

I hope I’ve helped!
