list, max/min in python

Question

Asked 5 years, 3 months ago

Viewed 4,333 times

3

Hello, I need to write a Python program that reads a .csv text file containing literally 5 million numbers and tells me which is the smallest and which is the largest number in the list. Now, the problems:

I used this code to open the list in python:

import csv 

lista = open('lista.csv', 'r')
reader = csv.reader(lista)

for linha in reader: 
    print (linha)

This works as expected, but to show the largest and the smallest values, I tried this:

import csv 

lista = open('lista.csv', 'r')
reader = csv.reader(lista)

for linha in reader:
    print(linha)

    menor = min(linha) 
    maior = max(linha) 

    print (menor, maior)

The code runs, but the real problem is that it reports the smallest value as empty (null) and the largest value as -83422495.2710933.

We have already tried splitting it up (one algorithm for the largest number and another for the smallest), to no avail. We also tried removing the 'for' loop, and that does not work either.

I would like to know if there is another way to do this or if we are missing something. Thank you very much.

1

I think all that is missing is a sample of the csv to see how the values are arranged: is it one value per line across 5 million lines, or 5 million numbers on a single line?

– Elton Nunes

2019/03/12 at 01:16
There are 9 or 10 columns and 5 million rows. I believe the program is reading all the numbers.

– Isabella Rosa

2019/03/12 at 01:28

2 answers

0

The problem was solved like this:

import re

pattern = re.compile(r"-?\d+\.?\d*")

with open("C:/Users/lab2d/Downloads/lista2.txt") as f:
    numeros = pattern.findall(f.read())

numeros = [float(i) for i in numeros]

if numeros:
    print("Maior valor:", max(numeros))
    print("menor:", min(numeros))
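Note that f.read() above loads the whole file into memory. As an alternative sketch, assuming the file is comma-separated, every non-empty cell is a plain number, and some cells may be empty, the values can be streamed row by row with csv.reader and converted to float before comparing. The function name min_max_csv and the sample data are illustrative, not part of the original answer.

```python
import csv
import io


def min_max_csv(linhas):
    """Return (menor, maior) over every numeric cell of a CSV stream."""
    menor = maior = None
    for linha in csv.reader(linhas):
        for celula in linha:
            celula = celula.strip()
            if not celula:
                continue  # skip empty cells instead of comparing them
            valor = float(celula)  # raises ValueError on non-numeric text
            if menor is None or valor < menor:
                menor = valor
            if maior is None or valor > maior:
                maior = valor
    return menor, maior


# Made-up sample standing in for lista.csv (note the empty cells)
amostra = io.StringIO("1.5,-83422495.2710933,\n200,,3\n")
print(min_max_csv(amostra))
```

With a real file, the same function can be called as min_max_csv(open('lista.csv', newline='')). It keeps only two numbers in memory at a time, and it returns (None, None) if no numeric cell is found.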

by Sidon • **6,563** points · Answer 1 · 2019-03-12T02:02:10+00:00

You can use pandas. I created an example with only 39 randomly generated numbers (3 rows of 13 columns) to simulate your csv; I read this 'file' into a pandas DataFrame object and then display the minimum and maximum values.

import io
import pandas as pd

# Simulating the csv
lista = '''
6848, 8453, 6877, 3508, 2071, 1962, 7274, 4901, 9369, 3498, 2138, 2504, 9948
6543, 7021, 260, 2392, 648, 9947, 6759, 3553, 3437, 4121, 2637, 8067, 9421 
6609, 5229, 1872, 2288, 8448, 9701, 1256, 4489, 7549, 2844, 4561, 3291, 5472 
'''

# Reading the csv
df = pd.read_csv(io.StringIO(lista), header=None)

# Displaying the result
print('Valor máximo:', df.values.max())
print('Valor mínimo:', df.values.min())

Output:

Valor máximo: 9948
Valor mínimo: 260

Notes:
1. I assumed that your csv has no header row for the columns; if it has one, remove header=None from the csv read command.
2. If you want or need more functions such as sum(), mean(), etc., pandas provides them as well.

Edited
To test whether the amount of data could be the problem, I created an example that builds a DataFrame with 6 million numbers taken from a randomly generated numpy array, then displays the minimum and maximum values and the mean of all values.

import numpy as np
import pandas as pd
qt = 6000000

df = pd.DataFrame(np.random.randint(0,qt,size=(1000000,6)))

print('Valor máximo:', df.values.max())
print('Valor minimo:', df.values.min())
print('Média dos valores:', df.values.mean())

Output:

Valor máximo: 5999999
Valor minimo: 2
Média dos valores: 2999119.789572667

Even without pandas, it is possible to extract the maximum and minimum values directly from a list of 5 million items in pure Python:

lista = list(range(0, 5000000))
print('Máximo:',max(lista))
print('Mínimo:',min(lista))

Output:

Máximo: 4999999
Mínimo: 0
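
One caveat when applying min() and max() to values read from a csv file: csv.reader yields every cell as a string, and strings are compared character by character, not numerically. The result described in the question (an empty smallest value and a negative largest value, computed separately for each row) is consistent with that kind of text comparison, although this cannot be confirmed without the file. A small sketch with made-up values:

```python
# Values as csv.reader would return them: all strings (made-up sample)
linha = ["10", "9", "-83422495.2710933", ""]

# String comparison: "" sorts first, and "9" > "10" because '9' > '1'
menor_texto = min(linha)
maior_texto = max(linha)
print(repr(menor_texto), repr(maior_texto))

# Numeric comparison after converting the non-empty cells to float
numeros = [float(c) for c in linha if c.strip()]
print(min(numeros), max(numeros))
```

Converting to float before comparing, and skipping empty cells, gives the numeric minimum and maximum.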