Python: extracting data and automatically emailing the information obtained


Friends,

The data extraction part is working, and the email sending partly works too.

I would like the same information that I print on screen, with the same formatting (line breaks, etc.), to be sent as the body of the email message, not as an attachment.

The idea was to write the information to the lista.txt file and then copy it into the email body. The part that takes what was printed on screen and uses it as the body of the email is what does not work. Could you help?

Another question: how could the program below be split into two files, for example one that extracts the information from the site and another that sends the email?

import os
import sys  # needed for the sys.exc_info() calls below
import smtplib
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart
###########################################################

import requests, time
from bs4 import BeautifulSoup as bs
from datetime import datetime


url = "http://www.purebhakti.com/resources/vaisnava-calendar-mainmenu-71.html"

url_post = 'http://www.purebhakti.com/component/panjika'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'}
payload = {'action': 2, 'timezone': 23, 'location': 'Rio de Janeiro, Brazil        043W15 22S54     -3.00'}

req = requests.post(url_post, headers=headers, data=payload)
soup = bs(req.text, 'html.parser')
eles = soup.select('tr td')
dates = (' '.join(d.select('b')[0].text.strip().split()) for d in eles if d.has_attr('class'))
#events = (' '.join(d.text.split()) for d in eles if not d.has_attr('class'))
events = ((d.text) for d in eles if not d.has_attr('class'))
calendar = dict(zip(dates, events))

#data_hoje = time.strftime("%d %b %Y", time.gmtime() ) # today's date
data_desejada=time.strftime("%d %b %Y", time.gmtime(time.time() + (3600 * 24 * 2))) # 2 days from now
print ("Prezados devotos, ")
print()
print("No dia %s, teremos o(s) seguinte(s) evento(s) no Calendario Vaisnava: " %(data_desejada))
print()
if(data_desejada in calendar):
    print(calendar[data_desejada],end = "" )
else:
    print('nenhum evento para hoje')
print()
print("Para mais detalhes acessem: %s " %(url))
print()
print("Jay Radhe!")



# this part does not work
# I would like to record the same text that was printed on screen above and then send it by email as the message body, not as an attachment

##arq = open('/home/gopala/Desktop/lista.txt', 'w')
##texto = """
##Prezados devotos,
##
##No dia %s, teremos o(s) seguinte(s) evento(s) no Calendario Vaisnava:  %(data_desejada))
##
##"""
##arq.write(texto)
##
##arq.close()
##



#### email sending part
COMMASPACE = ', '

def main():
    sender = '[email protected]'
    gmail_password = 'senhalegal'
    recipients = ['[email protected]']

    # Create the enclosing (outer) message
    outer = MIMEMultipart()
    outer['Subject'] = 'data no calendario Vaisnava'
    outer['To'] = COMMASPACE.join(recipients)
    outer['From'] = sender
    outer.preamble = 'You will not see this in a MIME-aware mail reader.\n'

    # List of attachments
    attachments = ['/home/gopala/Desktop/16839680_10212563027937627_634163502_n.jpg','/home/gopala/Desktop/lista.txt']

    # Add the attachments to the message
    for file in attachments:
        try:
            with open(file, 'rb') as fp:
                msg = MIMEBase('application', "octet-stream")
                msg.set_payload(fp.read())
            encoders.encode_base64(msg)
            msg.add_header('Content-Disposition', 'attachment', filename=os.path.basename(file))
            outer.attach(msg)
        except:
            print("Unable to open one of the attachments. Error: ", sys.exc_info()[0])
            raise

    composed = outer.as_string()

    # Send the email
    try:
        with smtplib.SMTP('smtp.gmail.com', 587) as s:
            s.ehlo()
            s.starttls()
            s.ehlo()
            s.login(sender, gmail_password)
            s.sendmail(sender, recipients, composed)
            s.close()
        print("Email sent!")
    except:
        print("Unable to send the email. Error: ", sys.exc_info()[0])
        raise

if __name__ == '__main__':
    main()

1 answer



Answer 1: String formatting

You could build a format string that already contains the line breaks and leaves placeholders for the variables. For example:

string_envio = "Prezados devotos, "
string_envio += "\n\n"

if data_desejada in calendar:
    # format() must be called on the string that holds the placeholder
    string_envio += "No dia {}, teremos o(s) seguinte(s) evento(s) no Calendario Vaisnava: ".format(data_desejada)
    string_envio += "\n\n"
    string_envio += calendar[data_desejada]
else:
    string_envio += "nenhum evento para hoje"

string_envio += "\n\n"
string_envio += "Para mais detalhes acessem: {}".format(url)
string_envio += "\n\n"
string_envio += "Jay Radhe!"

With this, the string already contains all the line breaks, with the variables filled into their placeholders.
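If rebuilding the text by hand feels error-prone, another option is to keep the print calls from the question and capture what they write. The sketch below assumes the prints are moved into a function; the names imprimir_mensagem and capturar_saida and the sample calendar entry are hypothetical, introduced only for illustration. It relies on contextlib.redirect_stdout from the standard library.

```python
import io
from contextlib import redirect_stdout


def imprimir_mensagem(data_desejada, calendar, url):
    """Print the message with the same print calls as the question's script."""
    print("Prezados devotos, ")
    print()
    print("No dia %s, teremos o(s) seguinte(s) evento(s) no Calendario Vaisnava: " % data_desejada)
    print()
    if data_desejada in calendar:
        print(calendar[data_desejada], end="")
    else:
        print("nenhum evento para hoje")
    print()
    print("Para mais detalhes acessem: %s " % url)
    print()
    print("Jay Radhe!")


def capturar_saida(funcao, *args):
    """Run funcao(*args) and return everything it printed as one string."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        funcao(*args)
    return buffer.getvalue()


# Made-up entry standing in for the scraped calendar (illustrative only).
calendario_exemplo = {"12 Mar 2017": "Evento de exemplo\n"}
corpo = capturar_saida(
    imprimir_mensagem,
    "12 Mar 2017",
    calendario_exemplo,
    "http://www.purebhakti.com/resources/vaisnava-calendar-mainmenu-71.html",
)
print(corpo, end="")  # still show the text on screen
```

The captured string keeps every line break exactly as printed, so it can play the same role as string_envio.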

Simply pass it as the message body, with no need to create an attachment.
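With the MIMEMultipart message from the question, passing the text as the body could look roughly like the sketch below: a MIMEText part is attached instead of a file. The function name montar_email, the example.com addresses, and the placeholder body are assumptions made for illustration.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def montar_email(sender, recipients, subject, body):
    """Build a multipart message whose first part is the plain-text body."""
    outer = MIMEMultipart()
    outer["Subject"] = subject
    outer["To"] = ", ".join(recipients)
    outer["From"] = sender
    # The text part is what mail readers show as the body; no file is attached.
    outer.attach(MIMEText(body, "plain", "utf-8"))
    return outer


string_envio = "Prezados devotos,\n\nJay Radhe!"  # placeholder body
mensagem = montar_email(
    "sender@example.com",
    ["recipient@example.com"],
    "data no calendario Vaisnava",
    string_envio,
)
composed = mensagem.as_string()  # the string handed to smtplib's sendmail()
```

The loop over attachments in the question can then be dropped, or kept only for the image if it is still wanted.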

Answer 2: Modularization

Modules: search, sending

Here you could use either functions or classes; I will choose functions.

File email.py:

conexao = conectar_email(login,senha) # returns a login object ready to send emails
conexao.enviar(para=email_destino,titulo=titulo_da_mensagem,corpo=mensagem_formatada_anteriormente)

This leaves all the connection complexity inside the functions.
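One possible shape for that sending module is sketched below. It is not the answer's own implementation: the class ConexaoEmail, the montar helper, and the choice to open the SMTP connection only when enviar is called are all assumptions. The Gmail host and port are taken from the question. It may also be safer to save the file under another name (for instance a hypothetical envio.py), because a local file called email.py shadows the standard library's email package that this code imports.

```python
import smtplib
from email.message import EmailMessage


class ConexaoEmail:
    """Holds the credentials and hides the SMTP details from the caller."""

    def __init__(self, login, senha, host="smtp.gmail.com", porta=587):
        self.login = login
        self.senha = senha
        self.host = host
        self.porta = porta

    def montar(self, para, titulo, corpo):
        """Build a plain-text message; no attachment parts are involved."""
        msg = EmailMessage()
        msg["From"] = self.login
        msg["To"] = para
        msg["Subject"] = titulo
        msg.set_content(corpo)
        return msg

    def enviar(self, para, titulo, corpo):
        """Open a TLS session, log in, and send the message."""
        msg = self.montar(para, titulo, corpo)
        with smtplib.SMTP(self.host, self.porta) as s:
            s.starttls()
            s.login(self.login, self.senha)
            s.send_message(msg)


def conectar_email(login, senha):
    """Return an object ready to send emails (it connects when sending)."""
    return ConexaoEmail(login, senha)
```

Keeping montar separate from enviar means the message can be inspected or tested without touching the network.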

File search.py:

def request(url, options): ...

Here options is a general dictionary with the headers to send, and the function returns the HTML of the requested page.

def find(html, element_procurado): ...

Here you pass all the HTML read by request along with the element you are looking for, and it returns the desired part of the HTML.

def parse(element): ...

This is where the actual scraping happens, but only on the element you were looking for, since it has already been through the previous processing. It can return whatever data structure you want, for example a list of the data, or a dictionary with the day as the key and the events as the values; that is up to you.

Used like this:

requisicao = request(sua_url,headers)
elemento = find(requisicao,'tr td')
conteudo = parse(elemento) # returns the structure
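As a rough illustration of the parsing step, the sketch below builds the date-to-events dictionary from raw HTML. It deliberately swaps BeautifulSoup for the standard library's html.parser so that it runs without extra dependencies, and it folds find and parse into a single parse(html) function; both choices are assumptions, as are the class name and the sample HTML. It follows the rule used in the question: td cells with a class attribute hold the date in bold, and cells without one hold the events.

```python
from html.parser import HTMLParser


class _CelulasCalendario(HTMLParser):
    """Collect text from <td> cells: date cells (with class) and event cells."""

    def __init__(self):
        super().__init__()
        self.datas = []
        self.eventos = []
        self._destino = None      # list receiving the text of the current <td>
        self._celula_data = False
        self._em_negrito = False

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            # Same rule as the question: a class attribute marks a date cell.
            self._celula_data = any(nome == "class" for nome, _ in attrs)
            self._destino = self.datas if self._celula_data else self.eventos
            self._destino.append("")
        elif tag == "b":
            self._em_negrito = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._destino = None
        elif tag == "b":
            self._em_negrito = False

    def handle_data(self, data):
        if self._destino is None:
            return
        # In date cells only the bold text is the date itself.
        if self._celula_data and not self._em_negrito:
            return
        self._destino[-1] += data


def parse(html):
    """Return {date: events}, with whitespace in the dates normalized."""
    p = _CelulasCalendario()
    p.feed(html)
    p.close()
    datas = (" ".join(d.split()) for d in p.datas)
    return dict(zip(datas, p.eventos))


# Made-up HTML imitating the layout the question's selectors rely on.
exemplo = (
    '<table><tr><td class="dia"><b>12  Mar\n2017</b> Sun</td>'
    "<td>Evento de exemplo</td></tr></table>"
)
conteudo = parse(exemplo)
```

Unlike the question's code, which reads only the first bold element of a date cell, this version concatenates all bold text in the cell and does not handle nested tables.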

Finally, put this content into the string, each value in its place, and pass it to the send function as the body parameter.

This eliminates (at least for now) the need for the attachment libraries.
