Briefly, in Python the basic types for manipulating binary data are bytes and bytearray. These, as well as arrays (array.array), are supported by memoryview, which uses the "buffer protocol" to access the memory of other binary objects without copying.
I confess this is the first time I've come across this in Python, but the concept seems similar to slices in Rust. I find it worth mentioning because this kind of thing usually matters when working with "lower level" operations, which is Rust's niche.
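For illustration, a minimal sketch (the variable names are my own) of what "accessing memory without copying" looks like in practice:

```python
# Minimal sketch: a memoryview over a bytearray gives direct access to the
# underlying memory, without copying the data.
data = bytearray(b"hello world")
view = memoryview(data)

print(view[0])        # 104, the byte value of 'h'
view[0] = ord(b"H")   # writing through the view changes the original object
print(data)           # bytearray(b'Hello world')
```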
What is the advantage?
The advantage is precisely speed and the lower cost of access, since no copy is ever made. Because of this, you can index and take slices without paying the cost of a copy.
Think of what memoryview returns as a "lens" that lets you read the elements by "looking directly into memory", which is not what the more "common" Python mechanisms do. Slicing a list, for example, copies the references.
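To make that concrete, a small sketch (names are illustrative) showing that slicing a memoryview does not copy, while slicing the underlying bytearray does:

```python
# Sketch: a slice of a memoryview is another view over the same memory,
# while a slice of the bytearray itself produces a copy.
data = bytearray(10)
view = memoryview(data)

part = view[2:5]      # no bytes are copied here
part[0] = 255         # writes directly into `data`
print(data[2])        # 255 -- the original reflects the change

copy = data[2:5]      # slicing the bytearray copies the bytes
copy[0] = 7
print(data[2])        # still 255; the original is untouched
```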
As I said at the beginning of the answer, memoryview can be seen as a slightly more "low level" Python API. In most cases, it makes no difference whether you use an array with memoryview or a regular list. However, there are cases where the cost of the copies really does add significant overhead; in those cases, it is the ideal tool.
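As a rough illustration of that cost (the exact numbers depend on the machine; timeit is used here only as a convenient way to compare), repeatedly slicing a large bytes object copies data on every call, while slicing a memoryview over it does not:

```python
import timeit

data = bytes(10_000_000)      # ~10 MB of zeros
view = memoryview(data)

# Each slice of `data` copies almost the whole buffer; each slice of
# `view` only creates another lightweight view over the same memory.
print(timeit.timeit(lambda: data[:-1], number=100))
print(timeit.timeit(lambda: view[:-1], number=100))
```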
Why doesn’t it work with lists?
The memoryview API works only with objects that implement the buffer protocol, which is not the case with lists.
Since every "element" of the value "wrapped" by memoryview must have the same size in memory, you cannot use it with lists, for example, because each element of that structure can occupy a different amount of memory.
In your case it didn't work for exactly that reason: you were using a list, a data structure whose values can potentially have varying sizes. But when you use an array of floats (as in the second example of the question), it works, since each element of the array is guaranteed to be the size of a float, that is, every element occupies the same amount of memory. The memoryview can therefore locate each element efficiently and unambiguously, allowing access without major cost.
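A short sketch of that difference (values are illustrative): memoryview rejects a list but accepts an array.array, whose elements all have the same fixed size:

```python
from array import array

try:
    memoryview([1.0, 2.0, 3.0])   # a list does not expose the buffer protocol
except TypeError as err:
    print(err)

floats = array('d', [1.0, 2.0, 3.0])  # 'd' = C double, 8 bytes per element
view = memoryview(floats)
print(view[1])        # 2.0, read straight from the array's memory
print(view.itemsize)  # 8 -- every element occupies the same number of bytes
```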
Thanks for the answer. Could you give an example of this performance gain with slicing? I tried to build an example from your explanation, but the result was not what I expected. See it on ideone: https://ideone.com/WMPFNu
– Lucas
In that example, since the array has very few elements, the cost of indexing a list item seems to be lower than the cost of instantiating a memoryview. It's important to watch out for premature optimization (which seems to be the case there). Most of the time, using memoryview may not add much value. However, for evidently more expensive operations, such as working with sequences with many elements (e.g. this example on the English Stack Overflow), the difference is already noticeable. :)
– Luiz Felipe
A little more dramatic: https://ideone.com/jjYjib
– Luiz Felipe
Great. Perhaps it would be worth adding this example to your answer so that future readers of this question get a more concrete view of the advantages of memoryview
– Lucas
Now it's in the comments. :D
– Luiz Felipe
Hahaha... yes. But not everyone reads comments, and the ideone link might break for some reason. Anyway, it was just a suggestion.
– Lucas