2
How do I improve this code?
import random
import string
import requests
from django.core.management.base import BaseCommand, CommandError
from django.core.exceptions import ValidationError
from optparse import make_option
from core.models import Movie
class Command(BaseCommand):
help = 'Faz o crawler numa api de filmes e retorna os dados.\n \
Uso: ./manage.py initdata\n \
ou: ./manage.py initdata -m 20\n \
ou: ./manage.py initdata -m 20 -y 2015'
option_list = BaseCommand.option_list + (
make_option('--movies', '-m',
dest='movies',
default=10,
help='Define a quantidade de filmes a ser inserido.'),
make_option('--year', '-y',
dest='year',
default=None,
help='Define o ano de lançamento do filme.'),
)
def get_html(self, year):
'''
Le os dados na api http://www.omdbapi.com/ de forma aleatoria
e escolhe um filme buscando por 2 letras
'''
# escolhe uma letra aleatoriamente
c = ''.join(random.choice(string.ascii_lowercase)for _ in range(2))
# se não for definido o ano, então escolhe um randomicamente
if year is None and isinstance(year, int):
year = str(random.randint(1950, 2015))
url = 'http://www.omdbapi.com/?t=' + c + \
'*&y=' + str(year) + '&plot=short&r=json'
return requests.get(url).json()
def save(self, title, year, released, director, actors, poster, imdbRating, imdbVotes, imdbID):
''' transforma "imdbVotes" em numero inteiro '''
if imdbVotes == "'N/A'":
imdbVotes = 0
else:
imdbVotes = imdbVotes.replace(',', '')
try:
''' Salva os dados '''
Movie.objects.create(
title=title,
year=year,
released=released,
director=director,
actors=actors,
poster=poster,
imdbRating=imdbRating,
imdbVotes=imdbVotes,
imdbID=imdbID,
)
except ValidationError:
print('O objeto não foi salvo.')
def handle(self, movies, year, **options):
''' se "movies" não for nulo, transforma em inteiro '''
if movies is not None:
movies = int(movies)
''' busca os filmes n vezes, a partir da variavel "movies" '''
for i in range(movies):
h = self.get_html(year)
j = 1 # contador
''' Se a resposta for falsa, então busca outro filme '''
while h['Response'] == 'False' and j < 100:
h = self.get_html(year)
print('Tentanto %d vezes' % j)
j += 1
''' se "year" não for nulo, transforma em inteiro '''
if "–" in h['Year']:
h['Year'] = year
# salva
self.save(h['Title'], h['Year'], h['Released'], h['Director'], h['Actors'],
h['Poster'], h['imdbRating'], h['imdbVotes'], h['imdbID'])
print(i + 1, h['Year'], h['Title'], h['Released'], h['Director'],
h['Actors'], h['Poster'], h['imdbRating'], h['imdbVotes'], h['imdbID'])
# print('Foram salvos %d filmes' % movies)
For example,
def save(self, title, year, released, director, actors, poster, imdbRating, imdbVotes, imdbID):
Enter only a dictionary of arguments.
title=title,
year=year,
released=released,
director=director,
actors=actors,
poster=poster,
imdbRating=imdbRating,
imdbVotes=imdbVotes,
imdbID=imdbID,
Can you turn it into something smaller, a dictionary maybe.
self.save(h['Title'], h['Year'], h['Released'], h['Director'], h['Actors'],
h['Poster'], h['imdbRating'], h['imdbVotes'], h['imdbID'])
Could also be optimized?
I strongly recommend reading the "Clean Code": http://www.buscape.com.br/codigo-limpo-robert-cecil-martin-8576082675.html . And avoid using "Try except" in your codes, even more so if you will just say: "print('The object has not been saved.')". You put a Try except because you expect an error. What’s the point of making a piece of code that causes an error? Python’s Zen says: "errors should not be silenced unless they are explicitly silenced". If you have an except, you must somehow receive a traceback email or some notification that such a problem has happened.
– Puam Dias
It’s worth your search for bulk_create as well. Instead of making 100 Movie.objects.create(), you would make 1 Movie.objects.bulk_create(list). It would optimise this process ai. https://docs.djangoproject.com/en/1.8/ref/models/querysets/#Bulk-create
– Puam Dias
@Puamdias I agree with almost everything you say, except in this matter of
try...except
. The ideal would be to inform the reason for the error, of course, but in this case it is obvious that it is a validation error, and if the intention of AP is that the program does not stop after finding an invalid data (i.e. that it ignores the invalid and continues processing the others) then there’s not much out but to usetry...except
. Likewise,bulk_create
would be ideal in an "all or nothing" scenario, but in the same way would not be able to use (I think) if you wanted to process valid ones and ignore invalid ones.– mgibsonbr
@mgibsonbr, what I meant in this part of Try except, is that many people find it "normal" to put this in any part of the code. In this case it is fair to have a Try except because it is importing something from external agent the application of it (it seems right), but it needs to have a notification of the reason of traceback. If the imported data is from himself, I think "nonsense" develop something I expect error from that and have to put Try except in the code.
– Puam Dias
I put
try
because I didn’t handle all errors from the api data.– Regis Santos