It's not that simple. While the JSON is still a string, i.e., serialized, you can use find (a Python string method) to locate the position of the characters you want. But that way you know nothing about the JSON structure: you won't know whether the "ma" you found comes from "Camargo" inside show.[artists].name or from a word that is a dictionary key in the JSON. For example, in '{"theme": "sertaneja music"}', a search for "me" would match inside the key "theme".
The right approach is to load the JSON into a Python data structure and then use a function that searches the whole tree recursively for the patterns you ask for. Such a function can return the entire record (or only the value).
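To make the difference concrete, here is a minimal sketch comparing a plain string search with a search over parsed data. The sample string `raw` and the name `hits_in_values` are made up for illustration; they are not from the question.

```python
import json

# Hypothetical sample document, invented for this illustration.
raw = '[{"artist": "Cesar Camargo Mariano", "theme": "sertaneja music"}]'

# Plain string search: it returns only a character offset, with no idea
# whether the hit is inside a key or inside a value.
offset = raw.find("me")
print(offset, raw[offset - 3:offset + 2])  # the hit sits inside the key "theme"

# After parsing, keys and values are separate objects, so the search can
# be restricted to values only.
records = json.loads(raw)
hits_in_values = [
    (index, key)
    for index, record in enumerate(records)
    for key, value in record.items()
    if isinstance(value, str) and "me" in value
]
print(hits_in_values)  # no value contains "me", so the list is empty
```

This only looks one level deep; nested lists and dictionaries need a recursive walk.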
These two functions can help. The first yields every occurrence where a match is found for what you search for, each paired with a "path" that tells you where in the JSON the occurrence was found. It is a generator, so use it in a for loop or pass the call as an argument to list.
The second allows fetching snippets of JSON using the "path" that is returned by the first function:
import re


def json_find(data, pattern, path=()):
    # Leaf values: test the pattern against their string form.
    if isinstance(data, (str, float, int, bool, type(None))):
        if re.search(pattern, str(data)):
            yield data, path
    # Lists: recurse into each item, extending the path with its index.
    elif isinstance(data, list):
        for i, item in enumerate(data):
            yield from json_find(item, pattern, path=path + (i,))
    # Dicts: recurse into each value, extending the path with its key.
    elif isinstance(data, dict):
        for key, value in data.items():
            yield from json_find(value, pattern, path=path + (key,))
    else:
        raise TypeError("Can't search patterns in instances of {}".format(type(data)))


def get_json_item_at(data, path):
    # An empty path means we have arrived at the requested item.
    if not path:
        return data
    return get_json_item_at(data[path[0]], path[1:])
And in interactive mode, if I put data like your example in variable "a", I can do:
In [141]: list(json_find(a, "Mariano"))
Out[141]:
[('Cesar Camargo Mariano e Helio Delmiro', (0, 'artist')),
('Cesar Camargo Mariano', (0, 'tracks', 0, 'author'))]
In [142]:
The output shows that the word "Mariano" was found in two places: the first at position 0 of the original list, under the key "artist"; the second at position 0 of the list, under the key "tracks", then at position 0 of that list, under the key "author".
The second function, which I included as a bonus, lets you start from the location of the "author" key, for example, and "climb up the tree" until you reach the record's information.
Using the full path, I get only the string where the match occurred:
In [142]: get_json_item_at(a, (0, 'tracks', 0, 'author'))
Out[142]: 'Cesar Camargo Mariano'
But if I drop the last items from the path, I get the enclosing structure, here the record's list of tracks:
In [143]: get_json_item_at(a, (0, 'tracks'))
Out[143]: [{'author': 'Cesar Camargo Mariano', 'time': "5'04", 'type': 'track'}]
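The same idea can be taken all the way up to the top-level record. The sketch below is self-contained, so it repeats get_json_item_at as a loop; the data in `a` is a hypothetical reconstruction of the example's shape, and `match_path` and `record` are names chosen for illustration.

```python
def get_json_item_at(data, path):
    # Follows each path step in turn; an empty path returns data itself.
    for step in path:
        data = data[step]
    return data


# Hypothetical data shaped like the example used in this answer.
a = [
    {
        "artist": "Cesar Camargo Mariano e Helio Delmiro",
        "tracks": [
            {"author": "Cesar Camargo Mariano", "time": "5'04", "type": "track"},
        ],
    },
]

# A path as returned by json_find for a match in a track's author.
match_path = (0, "tracks", 0, "author")

# Climb the tree by dropping trailing path items one at a time.
for depth in range(len(match_path), 0, -1):
    prefix = match_path[:depth]
    print(prefix, "->", type(get_json_item_at(a, prefix)).__name__)

# Keeping only the first path item gives the whole top-level record.
record = get_json_item_at(a, match_path[:1])
print(record["artist"])
```

Slicing the path like this works because each prefix of a valid path is itself a valid path into the same data.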
Use the function json.loads to convert a JSON string into a Python object, which, given the excerpt shown, will be a list of dicts. Should this search cover all dictionary values or only artist? – Woss
My friend, it should return at least the artist's name when you type one or more letters. For example, typing CE would return Cesar Camargo Mariano (I don't remember how to implement this). As for json.loads, OK. Another possibility would be this: jdata = json.loads(s) for Artist in jdata: for key, value in Artist.iteritems(): print key, value
– Rafael Fedozzi da Silva