I have a Python REST API that takes two arguments: a list of URLs and a word. I am a beginner in Python and would like to know whether it is possible to split the URL string so that the API accepts more than one URL when requesting the body of each URL. For example, I can currently make the request to http://127.0.0.1/?urls=globe.com&word=google
However, I would like to include more than one URL in the request, as in http://127.0.0.1/?urls=globe.com&urls=terra.com.br&urls=Uol.com.br&word=google
Here is the code:
from flask import Flask
from flask_restful import Resource, Api, reqparse, abort
import requests
app = Flask(__name__)
api = Api(app)
parser = reqparse.RequestParser()
parser.add_argument('urls', action='append')
parser.add_argument('word')
parser.add_argument('ignorecase')
# Performs a GET on the URL and returns how many times `word` appears in the response body
def count_words_in(url, word, ignore_case):
    r = requests.get(url)
    data = r.text
    # 'ignorecase' arrives as a query-string value, so compare it as text
    if str(ignore_case).lower() == 'true':
        return data.lower().count(word.lower())
    return data.count(word)
# Prepends 'http://' to the URL when it has no scheme and returns the resulting URL
def validate_url(url):
    if not url.startswith('http'):
        url = 'http://' + url
    return url
class UrlCrawlerAPI(Resource):
    def get(self):
        try:
            args = parser.parse_args()
            valid_urls = [validate_url(url) for url in args['urls']]
            lista = []
            for valid_url in valid_urls:
                lista.append({valid_url: {args['word']: count_words_in(valid_url, args['word'], args['ignorecase'])}})
                # Original bug: returning here left the function after the first URL
                # return {valid_url: {args['word']: count_words_in(valid_url, args['word'], args['ignorecase'])}}
            return lista
        # A missing 'urls' or 'word' argument is parsed as None, which raises TypeError or AttributeError
        except (AttributeError, TypeError):
            return {'message': 'Please provide URL and WORD arguments'}
        except Exception as e:
            return {'message': 'Unhandled Exception: ' + str(e)}
api.add_resource(UrlCrawlerAPI, "/")
if __name__ == '__main__':
    app.run(debug=True) 
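To illustrate how repeated `urls=` keys in the query string end up as a list, here is a minimal sketch that uses only the standard library. It is not how flask_restful implements `action='append'`; it only shows the same idea. The helper name `parse_request` is hypothetical and not part of the code above.

```python
from urllib.parse import urlsplit, parse_qs


def parse_request(request_url):
    """Hypothetical helper: return (list of urls, word) from the query string."""
    # parse_qs maps each key to a list holding every value given for that key
    query = parse_qs(urlsplit(request_url).query)
    urls = query.get('urls', [])
    # 'word' is expected once, so take the first value (None if it is absent)
    word = query.get('word', [None])[0]
    return urls, word


urls, word = parse_request(
    'http://127.0.0.1/?urls=globe.com&urls=terra.com.br&urls=Uol.com.br&word=google'
)
print(urls)  # ['globe.com', 'terra.com.br', 'Uol.com.br']
print(word)  # google
```

Because every repeated key is collected, no manual string splitting is needed as long as the client sends one `urls=` entry per URL.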
for valid_url in valid_urls: return ... Here you are only reading the first URL and then leaving the function.
– fernandosavio
Yes, but valid_urls is an array, so isn't the loop supposed to go through the whole array? It is almost right: in Postman the function correctly returns JSON with the number of occurrences of the word, but it only returns the first URL.
– Lucas Latorre
As soon as a return executes inside a for, the function exits, which also ends the loop. You would have to build a list with the results you want and return that list outside the for.
– fernandosavio
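A minimal sketch of the difference, using hypothetical functions `first_only` and `collect_all` that are not part of the API code; `upper()` stands in for the real per-URL work:

```python
def first_only(items):
    """Returns inside the loop, so only the first item is ever processed."""
    for item in items:
        return item.upper()


def collect_all(items):
    """Accumulates every result and returns once, after the loop has finished."""
    results = []
    for item in items:
        results.append(item.upper())
    return results


urls = ['globe.com', 'terra.com.br', 'uol.com.br']
print(first_only(urls))   # GLOBE.COM
print(collect_all(urls))  # ['GLOBE.COM', 'TERRA.COM.BR', 'UOL.COM.BR']
```

The second form matches the suggested fix: append inside the for, return outside it.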
Thank you very much, my friend, it worked! You helped me a lot. The edited code is now running 100% and showing all the URLs.
– Lucas Latorre
Glad it worked! You can mark the answer as correct if it helped you. :D
– fernandosavio