Calling a Python function that is already running

I have the following scenario.

I have a function that consumes a lot of RAM, and I would like to keep it running in the background so that I don’t have to reload all the data into memory every time I call it; when I need its result, I just call it.

If I need to pass it a parameter and have it return a result, how do I do this in Python?

Keep in mind that the part doing the processing needs to stay running, and when another part needs to communicate with it, it should simply call this module that is already running.

OK, as you suggested, let me give a better example.

I have a 1.5 GB file that contains several texts. I load it into memory and run queries against these texts. What I don’t want is to have to load this file again for every query.

I want it to run in the background while I navigate the other screens I have (I’m using Django; I created several pages, and one of them lets me query this file). Because several screens will use this file in different ways, I would like to have a process that is always running, so that when I need a result I call this process and it returns what I want.

1 answer

You’re practically describing one of the main sets of advantages of object-oriented programming.

In Python it is indeed possible, using "generators", to have a "function" that keeps state (the data) loaded, and to have a chunk of it executed from outside and receive a value back. However, this is not a very common use for functions.
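As a rough illustration of the generator approach, here is a minimal sketch. The name `consultor` and the tiny list of texts are hypothetical stand-ins, not something from the question; a real program would load the large file where the list is passed in.

```python
def consultor(textos):
    """Generator that keeps `textos` alive between calls (hypothetical example)."""
    resultado = None
    while True:
        # Pause here and hand back the last result; resume when the
        # caller sends the next search term with .send().
        termo = yield resultado
        resultado = [t for t in textos if termo in t]


# Stand-in for the expensive load; a real program would read the big file here.
gen = consultor(["python is fun", "django views", "python generators"])
next(gen)  # prime the generator: runs it up to the first `yield`

print(gen.send("python"))
print(gen.send("django"))
```

Each `.send()` resumes the same generator, so the data passed in at creation is never reloaded. The priming `next()` call and the `while True` loop are part of why this style is less readable than a class.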

For objects, however, it is quite natural. Unfortunately, you have not given an example of what you want to do. If the problem is that reading or producing the data takes a long time, or that you want to avoid repeated copies of the data, this method helps. If a single copy of the data already causes memory problems, there is not much that can be done.

Let’s assume that what you need is to read data from an Excel file, and then have a call that queries data from that spreadsheet. You can have a class that reads the file in the initializer (the __init__ method) and stores the data in memory, plus a method that queries a given cell:

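import pandas as pd

class Planilha:
    def __init__(self, caminho_do_arquivo):
        # Read the file once; the data stays in memory with the instance.
        self.dados = pd.read_excel(caminho_do_arquivo)

    def consulta(self, linha, coluna):
        # Look up a single cell by row label and column label.
        return self.dados.loc[linha, coluna]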

You then create an instance of the Planilha class, which reads the file once, and for each data query you call the consulta method. (This example is quite crude, since you could use the object returned by .read_excel, which is a pandas DataFrame, directly and would not even need the Planilha class. But if you need more specialized processing on top of the data in the spreadsheet, a design like this starts to make sense.)
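The same pattern can be sketched without pandas for a plain text file like the one in the question. The class name `TextCorpus`, its `search` method, and the small temporary file are all hypothetical, chosen only so the example runs on its own:

```python
import os
import tempfile


class TextCorpus:
    """Loads a text file once and answers queries from memory."""

    def __init__(self, path):
        # The expensive step: runs once, when the instance is created.
        with open(path, encoding="utf-8") as f:
            self.lines = f.read().splitlines()

    def search(self, term):
        # The cheap step: scans lines that are already in memory.
        return [line for line in self.lines if term in line]


# Small temporary file standing in for the 1.5 GB file.
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write("first text about Django\n")
    tmp.write("second text about pandas\n")
    tmp.write("third text about Django\n")

corpus = TextCorpus(tmp.name)   # the file is read here, once
os.remove(tmp.name)             # the data now lives in the instance

print(corpus.search("Django"))  # no file access on this call
print(corpus.search("pandas"))
```

As long as something keeps a reference to `corpus`, every later call to `search` reuses the lines loaded by `__init__`.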

Anyway, everything I’ve described so far applies to code that runs to initialize the data and then returns control to you, while keeping a reference to the initialized data.
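One way to keep such a reference at module level is to cache the loader, so the data is built on first use and reused afterwards. This is only a sketch: `get_texts`, `query`, and the `load_count` counter are hypothetical names, and the tuple of strings stands in for the real file contents.

```python
from functools import lru_cache

load_count = 0  # only here to show how many times the load actually runs


@lru_cache(maxsize=None)
def get_texts():
    """Build the data on the first call; later calls return the same object."""
    global load_count
    load_count += 1
    # Stand-in for reading the large file.
    return ("alpha beta", "beta gamma", "gamma delta")


def query(term):
    # Every query goes through get_texts(), but the load happens only once.
    return [t for t in get_texts() if term in t]


print(query("beta"))
print(query("gamma"))
print(load_count)
```

Note that state like this lives inside a single Python process; if a web server runs several worker processes, each one would hold its own copy of the data.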

If, however, what you describe really is code that has to "run in the background" and respond only when called (maybe with a partial result?), then that’s another story: you’ll need to use separate threads or processes. But it’s impossible to tell without an example. It may be that "what keeps running" is better off as a "worker" process adding data to a database, and you can use a normal function just to query this database. Ideally, post a follow-up question with the data you have, what you want, and the approaches you have tried, with some code.
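For the "keeps running and answers when called" case, a thread-based version might look roughly like the sketch below. It uses only `threading` and `queue` from the standard library; the `worker` and `ask` functions and the three sample strings are hypothetical, and a separate process would need a different communication channel.

```python
import queue
import threading


def worker(requests):
    # Stand-in for the expensive load; done once, when the thread starts.
    texts = ["alpha beta", "beta gamma", "gamma delta"]
    while True:
        item = requests.get()  # block until a request arrives
        if item is None:       # sentinel value: shut the worker down
            break
        term, reply = item
        reply.put([t for t in texts if term in t])


requests = queue.Queue()
thread = threading.Thread(target=worker, args=(requests,), daemon=True)
thread.start()


def ask(term):
    """Send a term to the running worker and wait for its answer."""
    reply = queue.Queue(maxsize=1)
    requests.put((term, reply))
    return reply.get(timeout=5)


first = ask("beta")
second = ask("delta")
print(first)
print(second)

requests.put(None)  # ask the worker to stop
thread.join()
```

The caller never touches `texts` directly; it only exchanges messages with the thread that owns the data.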

  • I edited my question with more details

  • If that is your use case, then the class example I gave is the answer. Instead of reading the spreadsheet with the Pandas call, you just read your text file.

  • But the ideal solution here is to put this data in an SQL database instead of keeping it in memory: then you don’t consume all this memory just to speed up access to the data. Leave that for a second step; for now, take the opportunity to study classes.
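To give an idea of the database suggestion, here is a minimal sketch using the standard-library `sqlite3` module. The table name `texts`, the `search` function, and the sample rows are hypothetical; in a Django project the data would more likely go through Django's own models.

```python
import sqlite3

# ":memory:" keeps the example self-contained; a real setup would pass
# a file path so the data lives on disk instead of in RAM.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE texts (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO texts (body) VALUES (?)",
    [("first text about Django",), ("second text about pandas",)],
)
conn.commit()


def search(term):
    # Parameterized LIKE query; the database does the scanning.
    rows = conn.execute(
        "SELECT body FROM texts WHERE body LIKE ? ORDER BY id",
        (f"%{term}%",),
    )
    return [body for (body,) in rows]


found = search("pandas")
print(found)
conn.close()
```

With the data in a database file, each page only needs a connection and a query, so nothing has to stay loaded in the web process between requests.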
