Insert documents into MongoDB via pymongo

Good evening. I am using Python for the first time in order to run a crawler. I've been able to run it and collect the data, and I want to save that data to MongoDB via pymongo. I tried to follow the official documentation, but for some reason I'm not getting it. Does anyone know how to insert the data, or has anyone done something like this before? Hugs.

import scrapy
import pymongo
from pymongo import MongoClient


class NameSpider(scrapy.Spider):
    name = 'SpiderName'
    allowed_domains = ['randomDomain']
    start_urls = ['randomDomain Url']

    def parse(self, response):
        data = []
        for selector in response.css("span.style_data"):
            data.append(selector.css("::text").extract())

        print(data)

# The data appears as expected; now I want to save its contents to MongoDB.

1 answer

Hello. Some information about your problem is missing, but I will try to help as best I can.

First of all, it helps to understand the basic concepts of a non-relational database. In MongoDB we basically have collections and documents. In short:

  • collections: groups of stored documents (very loosely comparable to a table in a relational database),
  • documents: the stored data itself. MongoDB stores documents in a JSON-like format (BSON); in pymongo, dictionaries are used to represent documents.

Basic operations of pymongo:
  • Create a client and connect to a database:

    from pymongo import MongoClient

    client = MongoClient()

    # connect to a local database (27017 is MongoDB's default port;
    # use whichever port your server actually listens on)

    client = MongoClient('localhost', 27017)
    # or
    client = MongoClient('mongodb://localhost:27017')

  • Access to a database:

    banco = client.crawler_db
    # or
    banco = client['crawler_db']

  • Access a specific collection:

    colecao = banco.dados_crawler
    # or
    colecao = banco['dados_crawler']

    Note: collections and databases are only created when the first document is inserted!

  • Creating a document:

    doc_exemplo = { "dado1" : 123, "dado2" : "teste_bd" }

    This is the expected format of a MongoDB document (a JSON-like dictionary).
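To illustrate the document format a bit further, here is a small sketch of a slightly richer document. The fields `tags` and `origem` are made up for this example and do not come from the question. The `json` round trip is only there to show that such a dictionary maps directly onto JSON.

```python
import json

# A document is just a Python dict; values can be numbers, strings,
# lists, or nested dicts. "tags" and "origem" are hypothetical fields.
doc_exemplo = {
    "dado1": 123,
    "dado2": "teste_bd",
    "tags": ["crawler", "teste"],
    "origem": {"spider": "SpiderName"},
}

# Round-trip through JSON text to show the structure survives unchanged.
as_json = json.dumps(doc_exemplo)
restored = json.loads(as_json)
print(restored == doc_exemplo)
```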

  • Inserting a document:

    dados_crawler = banco.dados_crawler

    resultado = dados_crawler.insert_one(doc_exemplo)

  • Inserting several documents:

    # doc_exemplo2 is a second dictionary built the same way as doc_exemplo
    resultado = dados_crawler.insert_many([doc_exemplo, doc_exemplo2])
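Putting the operations above together for the crawler case, a minimal sketch could look like the following. It assumes the scraped data is a flat list of strings; the helper names `build_documents` and `save_documents`, the `text` and `source` fields, and the example URL are all hypothetical. The pymongo import sits inside `main()` so the helpers can be tried out without a running server.

```python
def build_documents(texts, source_url):
    """Turn scraped strings into one dict (document) per non-empty string."""
    return [
        {"text": text.strip(), "source": source_url}
        for text in texts
        if text.strip()
    ]


def save_documents(collection, documents):
    """Insert the documents and return how many were written."""
    if not documents:
        # insert_many rejects an empty list, so skip the call entirely.
        return 0
    result = collection.insert_many(documents)
    return len(result.inserted_ids)


def main():
    # Imported here so the helpers above work without pymongo installed.
    from pymongo import MongoClient

    # Assumes a MongoDB server on the default local port.
    client = MongoClient("mongodb://localhost:27017")
    collection = client["crawler_db"]["dados_crawler"]

    scraped = ["first value", "  second value ", ""]  # stand-in for `data`
    documents = build_documents(scraped, "https://example.com")
    print(save_documents(collection, documents), "documents inserted")
```

Calling `main()` needs a reachable MongoDB server. In the spider from the question, `data` is a list of lists (one list per selector), so it would have to be flattened before being passed to `build_documents`.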

To keep this short, I believe this is enough to solve your problem.
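Since the data comes from Scrapy, one common way to wire this in is an item pipeline. The sketch below is only one possible arrangement, not necessarily what you need: the class name `MongoPipeline` and its default arguments are assumptions, and it expects the spider to `yield` dictionaries instead of printing them.

```python
class MongoPipeline:
    """Scrapy item pipeline sketch that writes each yielded item to MongoDB."""

    def __init__(self, uri="mongodb://localhost:27017",
                 db_name="crawler_db", collection_name="dados_crawler"):
        # Defaults are assumptions; adjust them to your own server and names.
        self.uri = uri
        self.db_name = db_name
        self.collection_name = collection_name
        self.client = None
        self.collection = None

    def open_spider(self, spider):
        # Called by Scrapy once when the spider starts.
        from pymongo import MongoClient
        self.client = MongoClient(self.uri)
        self.collection = self.client[self.db_name][self.collection_name]

    def close_spider(self, spider):
        # Called by Scrapy once when the spider finishes.
        if self.client is not None:
            self.client.close()

    def process_item(self, item, spider):
        # Called for every item the spider yields; store a plain-dict copy.
        self.collection.insert_one(dict(item))
        return item
```

To enable it, the pipeline would be registered in the project's `settings.py` through `ITEM_PIPELINES`, for example `{"myproject.pipelines.MongoPipeline": 300}`, where `myproject.pipelines` is a placeholder for wherever the class actually lives. In `parse`, something like `yield {"text": value}` would then replace `print(data)`.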

Below are references to documentation/tutorials with examples:

  1. Basic Tutorial - English
  2. Introduction to Mongodb - English
  • Thank you very much! I gave little information because I don't know much about Python and MongoDB (or about programming in general), but with your post I was able to understand more about the database and do what I wanted. Hugs
