I am trying to import a .json file with the following structure:
short_description:She left her husband. He killed their children. Just another day in America.
headline:There Were 2 Mass Shootings In Texas Last Week, But Only 1 On TV
date:2018-05-26
link:https://www.huffingtonpost.com/entry/texas-amanda-painter-mass-shooting_us_5b081ab4e4b0802d69caad89
authors:Melissa Jeltsen
category:CRIME
But apparently the JSON is not formatted properly (the file is here), so I could not import it using pandas like this:
df = pd.read_json('../input/news-category-dataset/News_Category_Dataset.json', lines=True)
I managed to read it this way:
import json

data = []
with open("News_Category_Dataset.json", "r") as f:
    for line in f:
        data.append(json.loads(line))  # each line is parsed into a dict
But from what I understand, this way the file is read like any plain text file and the JSON structure is lost (is that right?). So I would like to understand whether the structure is really wrong, whether I have to read it with pandas anyway, and/or whether reading it as a plain file still makes the data easy to manipulate.
EDIT: a larger chunk of the file
{"short_description": "She left her husband. He killed their children. Just another day in America.", "headline": "There Were 2 Mass Shootings In Texas Last Week, But Only 1 On TV", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/texas-amanda-painter-mass-shooting_us_5b081ab4e4b0802d69caad89", "authors": "Melissa Jeltsen", "category": "CRIME"}
{"short_description": "Of course it has a song.", "headline": "Will Smith Joins Diplo And Nicky Jam For The 2018 World Cup's Official Song", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/will-smith-joins-diplo-and-nicky-jam-for-the-official-2018-world-cup-song_us_5b09726fe4b0fdb2aa541201", "authors": "Andy McDonald", "category": "ENTERTAINMENT"}
{"short_description": "The actor and his longtime girlfriend Anna Eberstein tied the knot in a civil ceremony.", "headline": "Hugh Grant Marries For The First Time At Age 57", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/hugh-grant-marries_us_5b09212ce4b0568a880b9a8c", "authors": "Ron Dicker", "category": "ENTERTAINMENT"}
{"short_description": "The actor gives Dems an ass-kicking for not fighting hard enough against Donald Trump.", "headline": "Jim Carrey Blasts 'Castrato' Adam Schiff And Democrats In New Artwork", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/jim-carrey-adam-schiff-democrats_us_5b0950e8e4b0fdb2aa53e675", "authors": "Ron Dicker", "category": "ENTERTAINMENT"}
{"short_description": "The \"Dietland\" actress said using the bags is a \"really cathartic, therapeutic moment.\"", "headline": "Julianna Margulies Uses Donald Trump Poop Bags To Pick Up After Her Dog", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/julianna-margulies-trump-poop-bag_us_5b093ec2e4b0fdb2aa53df70", "authors": "Ron Dicker", "category": "ENTERTAINMENT"}
{"short_description": "\"It is not right to equate horrific incidents of sexual assault with misplaced compliments or humor,\" he said in a statement.", "headline": "Morgan Freeman 'Devastated' That Sexual Harassment Claims Could Undermine Legacy", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/morgan-freeman-devastated-sexual-misconduct_us_5b096319e4b0802d69cba298", "authors": "Ron Dicker", "category": "ENTERTAINMENT"}
{"short_description": "It's catchy, all right.", "headline": "Donald Trump Is Lovin' New McDonald's Jingle In 'Tonight Show' Bit", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/donald-trump-mcondalds-tonight-show_us_5b093561e4b0fdb2aa53daba", "authors": "Ron Dicker", "category": "ENTERTAINMENT"}
{"short_description": "There's a great mini-series joining this week.", "headline": "What To Watch On Amazon Prime That\u2019s New This Week", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/amazon-prime-what-to-watch_us_5b044625e4b0c0b8b23ec14f", "authors": "Todd Van Luling", "category": "ENTERTAINMENT"}
{"short_description": "Myer's kids may be pushing for a new \"Powers\" film more than anyone.", "headline": "Mike Myers Reveals He'd 'Like To' Do A Fourth Austin Powers Film", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/mike-myers-reveals-he-wants-to-do-a-fourth-austin-powers-film_us_5b096198e4b0802d69cb9f15", "authors": "Andy McDonald", "category": "ENTERTAINMENT"}
{"short_description": "You're getting a recent Academy Award-winning movie.", "headline": "What To Watch On Hulu That\u2019s New This Week", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/hulu-what-to-watch_us_5b0445bae4b0c0b8b23ec046", "authors": "Todd Van Luling", "category": "ENTERTAINMENT"}
{"short_description": "The pop star also wore a \"Santa Fe Strong\" shirt at his show in Houston.", "headline": "Justin Timberlake Visits Texas School Shooting Victims", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/justin-timberlake-visits-texas-school-shooting-victims_us_5b098161e4b0fdb2aa54167e", "authors": "Sebastian Murdock", "category": "ENTERTAINMENT"}
{"short_description": "The two met to pave the way for a summit between North Korean and the U.S.", "headline": "South Korean President Meets North Korea's Kim Jong Un To Talk Trump Summit", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/south-korean-president-meets-north-koreas-kim-jong-un_us_5b094ebae4b0fdb2aa53e504", "authors": "", "category": "WORLD NEWS"}
{"short_description": "The revolution is coming to rural New Brunswick.", "headline": "With Its Way Of Life At Risk, This Remote Oyster-Growing Region Called In Robots", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/remote-oyster-growing-region-called-in-robots_us_5b083658e4b0fdb2aa53415d", "authors": "Karen Pinchin", "category": "IMPACT"}
{"short_description": "Last month a Health and Human Services official revealed the government was unable to locate nearly 1,500 children who had been released from its custody.", "headline": "Trump's Crackdown On Immigrant Parents Puts More Kids In An Already Strained System", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/immigrant-children-separated-from-parents_us_5b087b90e4b0802d69cb4070", "authors": "Elise Foley and Roque Planas", "category": "POLITICS"}
{"short_description": "The wiretaps feature conversations between Alexander Torshin and Alexander Romanov, a convicted Russian money launderer.", "headline": "'Trump's Son Should Be Concerned': FBI Obtained Wiretaps Of Putin Ally Who Met With Trump Jr.", "date": "2018-05-26", "link": "https://www.huffingtonpost.com/entry/fbi-wiretaps-putin-ally-trump-jr_us_5b08bf56e4b0568a880b7859", "authors": "Michael Isikoff, Yahoo News", "category": "POLITICS"}
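For reference, the chunk above appears to hold one complete JSON object per line (a layout often called JSON Lines), not a single JSON document. The following is a minimal sketch of what that implies, using two shortened, made-up records in the same layout; the names `sample_lines`, `records` and `is_single_json_document` are only for illustration and are not part of the dataset.

```python
import json

# Two shortened, hypothetical records in the same one-object-per-line layout.
sample_lines = [
    '{"headline": "Example A", "date": "2018-05-26", "category": "CRIME"}',
    '{"headline": "Example B", "date": "2018-05-26", "category": "ENTERTAINMENT"}',
]

# Parsing line by line: json.loads turns each line into a regular dict,
# so the key/value structure of every record is preserved.
records = [json.loads(line) for line in sample_lines if line.strip()]
first = records[0]
categories = [r["category"] for r in records]

# Parsing the whole text as ONE JSON document fails, because several
# top-level objects in a row are not a single valid JSON value.
is_single_json_document = True
try:
    json.loads("\n".join(sample_lines))
except json.JSONDecodeError:
    is_single_json_document = False
```

Under these assumptions, `records` is a list of dicts, which `pd.DataFrame(records)` accepts directly, so reading the file line by line does not have to mean giving up the structure.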