Guys, I'd like help turning a log file into a key/value dictionary for later use, but I'm stuck on the code.
I have the following data:
default: T (add header): [10.78/15.00] [SURBL_VERYBAD(5.00){test.abuse.dnsbl;},HTML_SHORT_LINK_IMG_1(2.00){},IP_SCORE(1.99){ip: (1.56), ipnet: 123.245.3.34/19(2.57), asn: 12876(2.45), country: FR(0.06);},HAS_INTERSPIRE_SIG(1.00){},MID_RHS_WWW(0.50){},MIME_HTML_ONLY(0.20){},BAD_REP_POLICIES(0.10){},HAS_LIST_UNSUB(-0.01){},ARC_NA(0.00){},ASN(0.00){asn:12876, ipnet:123.245.3.34/19, ipnet:123.245.3.34/19, country:FR;},DKIM_TRACE(0.00){test.com:+;},DMARC_POLICY_ALLOW(0.00){teste.com;none;},FROM_EQ_ENVFROM(0.00){},FROM_HAS_DN(0.00){},HAS_REPLYTO(0.00){[email protected];},MIME_TRACE(0.00){0:~;},PREVIOUSLY_DELIVERED(0.00){[email protected];},RCPT_COUNT_ONE(0.00){1;},RCVD_COUNT_TWO(0.00){2;},RCVD_TLS_LAST(0.00){},REPLYTO_ADDR_EQ_FROM(0.00){},R_DKIM_ALLOW(0.00){test.com:s=dkim;},R_SPF_ALLOW(0.00){+ptr;},TO_DN_NONE(0.00){},TO_MATCH_ENVRCPT_ALL(0.00){}])
And I'd like to get the following output:
{"default: T (add header)": "10.78/15.00",
"SURBL_VERYBAD": "5.00",
"HTML_SHORT_LINK_IMG_1": "2.00",
"IP_SCORE": "1.99",
"ip": "1.56",
"ipnet: 123.245.3.34/19": "2.57",
"asn": "2.45",
"country": "FR",
"HAS_INTERSPIRE_SIG": "1.00",
"MID_RHS_WWW": "0.50",
"MIME_HTML_ONLY": "0.20",
"BAD_REP_POLICIES": "0.10",
"HAS_LIST_UNSUB": "-0.01",
"ARC_NA": "0.00",
"DMARC_POLICY_ALLOW": "0.00",
"FROM_EQ_ENVFROM": "0.00",
"HAS_REPLYTO": "5.00"
}
Keep in mind that I will run this on a file containing several lines like this one, turning each line into a dictionary.
This is what I have so far, adapted from a solution on Stack Overflow itself, but it is not what I need yet:
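#!/usr/bin/env python
# coding: utf-8
from itertools import tee

# Read the first line of the log and split it on commas
with open('new.txt', 'r') as arquivo:
    dados = arquivo.readline().split(',')

def pairwise(iterable):
    # Yield consecutive pairs: (s0, s1), (s1, s2), ...
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

# Maps each field to the field that precedes it
name_map = {number: name for name, number in pairwise(dados)}
print(name_map)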
It gives me this output:
{" ' qid: <48FVCS2HX2zRhRN>'": "['<[email protected]>'", " ' ip: 123.83.149.223'": " ' qid: <48FVCS2HX2zRhRN>'", " ' from: <[email protected]>'": " ' ip: 123.83.149.223'", " ' (default: T (add header): [10.78/15.00] [SURBL_VERYBAD(5.00)'": " ' from: <[email protected]>'", " 'HTML_SHORT_LINK_IMG_1(2.00){}'": " ' (default: T (add header): [10.78/15.00] [SURBL_VERYBAD(5.00)'", " 'IP_SCORE(1.99){ip: (1.56)'": " 'HTML_SHORT_LINK_IMG_1(2.00){}'", " ' ipnet: 123.83.128.0/19(2.57)'": " 'IP_SCORE(1.99){ip: (1.56)'", " ' asn: 12876(2.45)'": " ' ipnet: 123.83.128.0/19(2.57)'", " ' country: FR(0.06);}'": " ' asn: 12876(2.45)'", " 'HAS_INTERSPIRE_SIG(1.00){}'": " ' country: FR(0.06);}'", " 'MID_RHS_WWW(0.50){}'": " 'HAS_INTERSPIRE_SIG(1.00){}'", " 'MIME_HTML_ONLY(0.20){}'": " 'MID_RHS_WWW(0.50){}'", " 'BAD_REP_POLICIES(0.10){}'": " 'MIME_HTML_ONLY(0.20){}'", " 'HAS_LIST_UNSUB(-0.01){}'": " 'BAD_REP_POLICIES(0.10){}'", " 'ARC_NA(0.00){}'": " 'HAS_LIST_UNSUB(-0.01){}'", " 'ASN(0.00){asn:12876'": " 'ARC_NA(0.00){}'", " ' ipnet:123.83.128.0/19'": " 'ASN(0.00){asn:12876'", " ' country:FR;}'": " ' ipnet:123.83.128.0/19'", " 'DKIM_TRACE(0.00){teste.net.br:+;}'": " ' country:FR;}'", " 'DMARC_POLICY_ALLOW(0.00){teste.net.br;none;}'": " 'DKIM_TRACE(0.00){teste.net.br:+;}'", " 'FROM_EQ_ENVFROM(0.00){}'": " 'DMARC_POLICY_ALLOW(0.00){teste.net.br;none;}'", " 'FROM_HAS_DN(0.00){}'": " 'FROM_EQ_ENVFROM(0.00){}'", " 'HAS_REPLYTO(0.00){[email protected];}'": " 'FROM_HAS_DN(0.00){}'", " 'MIME_TRACE(0.00){0:~;}'": " 'HAS_REPLYTO(0.00){[email protected];}'", " 'PREVIOUSLY_DELIVERED(0.00){[email protected];}'": " 'MIME_TRACE(0.00){0:~;}'", " 'RCPT_COUNT_ONE(0.00){1;}'": " 'PREVIOUSLY_DELIVERED(0.00){[email protected];}'", " 'RCVD_COUNT_TWO(0.00){2;}'": " 'RCPT_COUNT_ONE(0.00){1;}'", " 'RCVD_TLS_LAST(0.00){}'": " 'RCVD_COUNT_TWO(0.00){2;}'", " 'REPLYTO_ADDR_EQ_FROM(0.00){}'": " 'RCVD_TLS_LAST(0.00){}'", " 'R_DKIM_ALLOW(0.00){teste.net.br:s=dkim;}'": " 'REPLYTO_ADDR_EQ_FROM(0.00){}'", " 
'R_SPF_ALLOW(0.00){+ptr;}'": " 'R_DKIM_ALLOW(0.00){teste.net.br:s=dkim;}'", " 'TO_DN_NONE(0.00){}'": " 'R_SPF_ALLOW(0.00){+ptr;}'", " 'TO_MATCH_ENVRCPT_ALL(0.00){}])'": " 'TO_DN_NONE(0.00){}'", " ' len: 2351'": " 'TO_MATCH_ENVRCPT_ALL(0.00){}])'", " ' time: 235.999ms real'": " ' len: 2351'", " ' 44.667ms virtual'": " ' time: 235.999ms real'", " ' dns req: 39'": " ' 44.667ms virtual'", " ' digest: <accefae22f4a22bfc94217189668f964>'": " ' dns req: 39'", " ' rcpts: <[email protected]>'": " ' digest: <accefae22f4a22bfc94217189668f964>'", " ' mime_rcpts: <[email protected]>\\n": " ' rcpts: <[email protected]>'"}
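To see why the output has this shape, here is a small sketch of the same comprehension run on three made-up fields (the `fields` list is illustrative, not real log data). Each consecutive pair (previous, next) is stored as next: previous, so every field ends up pointing at the field before it instead of a tag pointing at its score.

```python
from itertools import tee

def pairwise(iterable):
    # Yield consecutive pairs: (s0, s1), (s1, s2), ...
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

# Hypothetical stand-in for the comma-split log fields
fields = ["A(1.00){}", "B(2.00){}", "C(3.00){}"]

# Same comprehension as in the attempt above
name_map = {number: name for name, number in pairwise(fields)}
print(name_map)
# {'B(2.00){}': 'A(1.00){}', 'C(3.00){}': 'B(2.00){}'}
```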
Leandro, would it be possible to change the structure of this data in some way? As it stands, there is no pattern to follow.
– JeanExtreme002
It would be much easier if your data followed a set pattern or were stored as JSON. The only solution I can see is to parse each field manually, but that would make your code very large and is certainly not what you want.
– JeanExtreme002
So, @JeanExtreme002, unfortunately there is no way, because the structure is defined by the application. I also only see that alternative, but it is not really what I wanted because it is not an elegant option. I'm racking my brain over this! Another option would be to turn the scores into a list and the tags into columns of a CSV, which would also be viable.
– Léo B
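One possible direction, as a minimal sketch and not a complete solution: the top-level tags do follow a loose pattern, NAME(score){options}, which a regular expression can pick out. The function name `parse_line` and both patterns below are assumptions made for illustration. The sketch keeps scores as strings, as in the desired output, and deliberately skips the nested values inside the braces (ip, ipnet, asn, country), which would need separate handling.

```python
import re

# Assumed pattern for the header: "default: T (add header): [10.78/15.00]"
HEADER_RE = re.compile(r"(default: \S+ \([^)]*\)): \[([^\]]+)\]")
# Assumed pattern for a top-level tag: NAME(score){ ... }
# Requiring "{" after the score skips nested entries such as "FR(0.06);"
SYMBOL_RE = re.compile(r"([A-Z][A-Z0-9_]*)\((-?\d+(?:\.\d+)?)\)\{")

def parse_line(line):
    """Return a dict with the header score and each top-level tag's score."""
    result = {}
    header = HEADER_RE.search(line)
    if header:
        result[header.group(1)] = header.group(2)
    for name, score in SYMBOL_RE.findall(line):
        result[name] = score
    return result

# Shortened version of the sample line from the question
sample = (
    "default: T (add header): [10.78/15.00] "
    "[SURBL_VERYBAD(5.00){test.abuse.dnsbl;},"
    "IP_SCORE(1.99){ip: (1.56), asn: 12876(2.45), country: FR(0.06);},"
    "HAS_LIST_UNSUB(-0.01){},ARC_NA(0.00){}])"
)
print(parse_line(sample))
```

For a file with several lines, the same function could be applied to each line in turn, giving one dictionary per line.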