How to avoid running out of memory on a large SELECT in Rails?

In the routine I’m developing, I run a simple SELECT against the database, on a single table:

Model.select("id").where(:tipo => 2).find_each do |registro|
    puts registro.id
end

But this SELECT returns around 160,000 records, and the process then dies with:

pid 258 SIGKILL (Signal 9)

If I comment out this line, the rest of my code runs normally. I researched the problem and switched from each to the find_each method, but the error persisted. If I add a limit to the query, it also works normally.

As I understand it, the error happens only because the volume of data is larger than the available memory. From reading http://guides.rubyonrails.org/active_record_querying.html#retrieving-Multiple-Objects-in-batches and https://www.webascender.com/blog/rails-tips-speeding-activerecord-queries/, I understood that find_each should improve this situation, but it did not help. How do I solve this?

  • Go to the command line in your Rails project folder and type rails console. It opens a shell similar to irb. There, run Model.select("id").where(:tipo => 2) and look at the SQL query it generates. Then edit the question to include that query.

  • How much memory do you have? Which OS?

1 answer

I believe you need the find_in_batches method. It splits your query into several queries over smaller ranges of records, so Rails does not overload memory by trying to map every row to an in-memory object at once.

Example:

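Model.select("id").where(:tipo => 2).find_in_batches do |registros|
  # Each block call receives an array of records (one batch), not a single record.
  registros.each { |registro| puts registro.id }
end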

By default it fetches 1000 records per batch. That is usually fine, but if you still hit the same problem with that size, you can tune it with the batch_size option.

Model.select("id").where(:tipo => 2).find_in_batches(batch_size: 500) do |registros|
  # Each batch now holds at most 500 records.
  registros.each { |registro| puts registro.id }
end
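To make the batching idea concrete, here is a minimal plain-Ruby sketch of the general technique (fetch a limited slice ordered by id, then continue from the last id seen). It is not the Rails implementation: ROWS, fetch_batch and each_batch are hypothetical names, and an in-memory array stands in for the table so the snippet runs without a database.

```ruby
# Hypothetical stand-in for the table: 2,500 rows with ascending ids.
ROWS = (1..2500).map { |id| { id: id, tipo: 2 } }.freeze

# Simulates one query: WHERE id > last_id ORDER BY id LIMIT batch_size.
def fetch_batch(rows, last_id, batch_size)
  rows.select { |row| row[:id] > last_id }.first(batch_size)
end

# Yields one array of rows per iteration, so the caller only ever
# holds a single batch instead of the whole result set.
def each_batch(rows, batch_size: 1000)
  last_id = 0
  loop do
    batch = fetch_batch(rows, last_id, batch_size)
    break if batch.empty?
    yield batch
    # A short batch means there is nothing left to fetch.
    break if batch.size < batch_size
    last_id = batch.last[:id]
  end
end

batch_sizes = []
each_batch(ROWS, batch_size: 1000) { |batch| batch_sizes << batch.size }
p batch_sizes
```

Note that find_each walks the same kind of batches and simply yields the records one at a time, so if the process is still killed with either method, it may be worth checking what else in the routine holds on to memory.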

Check the API for details
