PSYCOPG2 - Pagination

Asked

Viewed 470 times

2

In the search for how to make paging using psycopg2 I discovered the attributes cursor itersize. and cursor., and the methodscursor.fetchmany() and scroll cursor.()

I wanted to check with the staff here if my understanding is correct about these methods and attributes. fetchmany() returns many lines, whose quantity can be passed as parameter. scroll() from what I understand serves to scroll the cursor to some position among the lines loaded by the query, but I did not understand the parameter it receives. Already the attribute itersize as I understand it, it controls the number of lines the cursor should receive from the bank. Perhaps behind the driver is making use of the parameter LIMIT postgresql which limits the number of lines that must be returned by the query. The attribute arraysize sets the number of lines that will be returned by the method fetchmany(). I did not understand the purpose of this attribute in the object being that the method fetchmany() already receives a parameter to define this.

I need to display the rows of a table in a Gui table component, from some graphical api I haven’t decided on yet. But I think it is more appropriate that I make use of these attributes and methods to create a pagination, as creating an object cursor with all data from a table that has many records can consume many resources.

1 answer

1


in Python we use a maxim: "Premature Optimization is the root of all evil": You don’t know, and there’s no way to know a priori, before your system works if you’re gonna have a memory bottleneck or I/O there.

Moreover, you are using a database connector that has been developed for several years, and has great potential to use best practices in Python, which involve yes, search results in a "Lazy" way - that is, search results only as they are needed.

So the recommendation is to believe that the standard parameters of fetchmany will be good in most cases, and use your results in the recommended way in Python: as an iterator.

for result in cursor.fetchmany():
      # your code here

This way, you leave the job of bringing an optimal amount of records to the connector, and work with one result at a time. Once the code is ready, you can benchmark, load and use memory tests to see how this is going - and then worry about these parameters.

Disclaimer: I’ve seen people have problems with fetchmany in an old version of a connector for Mysql - in that case, the internal implementation was really childish and it was much better for the person to get all the results at once, on a list of what to use an interaction like the one above. But I believe that in the case of psycopg2, the fetchmany is better.

If you want a pagination on your side, you can do this simply by creating a Python iterator, which is independent of the bank-side pagination: you can do so:

def get_paged_results(cursor, limit=50):
    while True:
        page = list(cursor.fetchmany(limit))
        if not page:
              break
        yield page

...
for page in get_paged_results(cursor):
     ....
  • So I have no preference for where to make the pagination. I thought at bank level not to generate a monstrous cursor full of records, taking into account that it will be a server client application the consumption on the network will also be great in the case of a very swollen table. You are right about the premature optimization, but I am sure that the records in this table will be many, since a new record will be inserted with each transaction. Then it will grow gradually and soon, it will have thousands of records. I wanted to leave it treated so that I do not have to move later.

  • I could make this pagination through direct SQL, but if psycopg2 has Features to do this I prefer it. That’s why I wanted to know more about these methods and attributes that I mentioned, because it seems to me that they serve precisely this. If possible with examples.

  • Special the atibuto itersize and the method scroll() because they seem to be exactly what I’m looking for.

  • Opa joão, rereading your comment and reading again the doc I think maybe I understood what you meant. From what I understand, by default psycopg2 behind 2000 bank records, number this set in attribute Cursor.itersize. When you say "search for results in a "Lazy" way - that is, search for results only as they are needed." You mean if I need more records the driver himself will take care of searching?

  • 1

    Exactly - the driver himself picks more results as you consume them (I just checked the source code of Psicopg2 and he does it himself)

Browser other questions tagged

You are not signed in. Login or sign up in order to post.