What are the "stack" and "heap", and where are they?


217

What are these things called "stack" and "heap" that are talked about so much in memory management?

Are they actual portions of memory, as some people say, or just abstract concepts to make it easier to understand how memory is managed?

Is one of them faster than the other? If one is clearly faster, why use the other at all?

Does it matter if I'm using, for example, Assembly, C, Java, or JavaScript, Windows or Linux? Is this controlled by the language or by the operating system?

Anyway, I want to better understand this concept, which seems to be misunderstood by many programmers. An explanation would be very useful for anyone who is starting out or who learned it wrong.

  • https://translate.google.com/translate?hl=&sl=fr&tl=pt&u=http%3A%2F%2Fsasfepu78.ddns.net%2Farticles%2FCours_asm_68000_Feroce%20Lapin.pdf Automatic translation of an ASM 68000 course I wrote 30 years ago; with the help of some coffee, and a bit of luck, this will allow you to understand how it all works. :)

4 answers

225


A stack (or pilha, in Portuguese), in this context, is an optimized way of organizing data in memory, allocated in sequence and abandoned (yes, there is usually no deallocation) in the reverse order of entry.

[image: a stack (pilha)]

A heap (or monte; ok, nobody translates this one) is a more flexible memory organization that allows the use of any available logical area.

[image: a heap (monte)]

Which stack are we talking about?

There are some very widespread stack concepts in computing, to name a few:

  • There is the execution stack of some architectures, where instructions and data are pushed and, after something there is executed, popped off.
  • There is the function call stack, which gets confused with memory management, where functions are stacked as they are called and leave the stack when their execution ends.
  • There is the generic data structure that stacks arbitrary data (the original answer links to an example in C#; a minimal sketch follows below).
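
As an illustration of that last item, here is a minimal sketch of an array-backed stack in C (this is not the C# example the original answer linked to; names such as my_stack are just for illustration):

    #include <stdio.h>

    /* A tiny fixed-capacity stack of ints: last in, first out (no overflow checks, just a sketch). */
    typedef struct {
        int data[16];
        int top;                       /* index of the next free slot */
    } my_stack;

    static void push(my_stack *s, int value) { s->data[s->top++] = value; }
    static int  pop(my_stack *s)             { return s->data[--s->top]; }

    int main(void) {
        my_stack s = { .top = 0 };
        push(&s, 1);
        push(&s, 2);
        push(&s, 3);
        while (s.top > 0)
            printf("%d\n", pop(&s));   /* prints 3, then 2, then 1 */
        return 0;
    }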

Abstract concept

The two concepts in the question are abstract. There is no physical memory area specific to the stack (much less is its area physically stacked), and there is no area reserved for the heap; on the contrary, it is usually quite fragmented. We use the concepts to better understand how things work and their implications, especially in the case of the stack.

Most modern and popular computer architectures do not have great facilities for manipulating this stack memory (usually there is only the stack pointer register), nor the heap, although in the latter case instructions that help manipulate virtual memory do, in a way, help organize the heap; but that goes for all memory, not only the heap.

Getting a little more concrete

The operating system is well aware of these concepts, and it is essential that it has some way, even if limited, of manipulating the memory of applications, mainly in modern, general-purpose systems. Modern systems have complex management through what is conventionally called virtual memory, which is also an abstract concept and often misunderstood.

Where we handle this directly

In Assembly or C it is very common to have contact with this memory management. In Assembly it is common to manipulate the stack almost directly, and in both languages at least the allocation and deallocation of the heap must be done manually through the operating system's API. In C the stack is managed by the compiler, except for any unusual operation that may be required.

Nothing prevents you from using a library that abstracts this manipulation, but that is only common in higher-level languages. In fact, it is very common for other languages to use the OS API internally to do memory management "wholesale", while "retail" memory access is handled by a manager of their own, usually called a Garbage Collector, using reference counting for objects in the heap (some consider that this is not a Garbage Collector technique) or later checking whether there are still references to an object in the heap. Even when using a more abstract library, the concepts remain.

The higher the level, the less they need to manage all this, but understanding the overall workings is important in all languages.

Languages that do not need performance can leave everything on the heap and "facilitate" compaction and access.

Stack

Allocation

Under normal conditions, the stack is allocated at the beginning of the application's execution, more precisely at the start of each thread, even if the application only has the main thread.

The stack is a contiguous portion of memory set aside to stack up the data needed while blocks of code are executed.

Each allocation takes a slice of the stack, always used in sequence, determined by a marker, that is, a pointer, which keeps "moving" to indicate that another part of this reserved portion has just been committed.

When the data reserved in a slice is no longer needed, this marker moves in the direction opposite to the data sequence, indicating that that data can be discarded (overwritten by new data).

There is no real allocation of each chunk of memory in the stack; there is just the movement of this pointer indicating that that area will be used by some data.
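
A hedged way to observe this from C (the exact addresses and the direction of growth depend on the compiler and platform, but the reuse is typical):

    #include <stdio.h>

    /* Each call just "moves the marker": its locals take the next slice of the stack. */
    static void inner(void) {
        int b = 2;
        printf("inner: &b = %p\n", (void *)&b);
    }

    static void outer(void) {
        int a = 1;
        printf("outer: &a = %p\n", (void *)&a);
        inner();               /* b is reserved right next to a by moving the stack pointer */
    }                          /* on return the pointer moves back; a and b are abandoned */

    int main(void) {
        outer();
        outer();               /* typically prints the same addresses again: the space was reused */
        return 0;
    }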

Roughly speaking, we can say that the application has full control over the stack, except when the available space in it runs out.

There are ways to change the size of the stack manually, but that is unusual.

Functioning

The stack works in LIFO order (Last In, First Out), or UEPS in Portuguese.

The scope of a variable usually defines the allocation time in the stack. The data used as parameters and function returns are allocated in the stack. That’s why the function call stack gets confused with the memory stack.

We can say that the parameters are the first variables of a function to be allocated on the stack. Access to data on the stack is usually direct, but indirect access also exists.

[image: stack allocation]
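
A small sketch, in plain C, of the direct vs. indirect access mentioned above (nothing here is specific to any library):

    #include <stdio.h>

    /* Indirect access: the callee reaches a variable that lives in the caller's stack frame. */
    static void increment(int *p) {
        *p += 1;               /* indirect access, through a pointer */
    }

    int main(void) {
        int x = 41;            /* local variable allocated on the stack */
        increment(&x);         /* the address of a stack variable is passed as a parameter */
        printf("%d\n", x);     /* direct access by name: prints 42 */
        return 0;
    }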

You can see that each thread has its own stack, right? And the size of each thread's stack can be set before the thread is created. A default value is usually used.
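
For example, with POSIX threads the stack size can be chosen before the thread is created (a sketch; the 1 MiB value is arbitrary and must be at least PTHREAD_STACK_MIN; compile with -pthread):

    #include <pthread.h>
    #include <stdio.h>

    static void *worker(void *arg) {
        (void)arg;
        puts("running on a thread with a custom stack size");
        return NULL;
    }

    int main(void) {
        pthread_attr_t attr;
        pthread_t t;

        pthread_attr_init(&attr);
        pthread_attr_setstacksize(&attr, 1024 * 1024);  /* ask for 1 MiB instead of the default */

        pthread_create(&t, &attr, worker, NULL);
        pthread_join(t, NULL);
        pthread_attr_destroy(&attr);
        return 0;
    }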

The stack is considered a form of automatic allocation (often confused with static allocation, which is the allocation that happens at load time, right at the start of execution). Technically there is another area of memory that really is static, allocated before execution starts. The effectively static area cannot be manipulated; it cannot be written to (at least it should not be possible). The stack in itself is static, although its data is not; after all, data is placed and abandoned according to its use, and its management is automatic.

Decision on where to allocate

As with the heap, you cannot allocate data on the stack before knowing its size (it does not need to be known at compile time, only when the allocation executes, although the stack has some restrictions). But if the size is undetermined at compile time, or can be determined to be possibly large (more than a few dozen bytes, say), the allocation should probably happen on the heap.

High-level languages predetermine this. Others give the programmer more control, who may even abuse the stack when it is useful and they know what they are doing.
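
A sketch of that decision in C, where the programmer chooses explicitly (the sizes are only illustrative):

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        char small[64];                  /* size known and small: fine on the stack */
        strcpy(small, "fits on the stack");

        size_t n = 10 * 1024 * 1024;     /* large, or only known at run time: goes to the heap */
        char *big = malloc(n);
        if (big == NULL)
            return 1;
        memset(big, 0, n);

        printf("%s / %zu bytes on the heap\n", small, n);
        free(big);
        return 0;
    }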

Stack overflow

The famous stack overflow occurs when you try to allocate something on the stack and there is no reserved space left. Also, in some cases, if the language provides mechanisms that allow it, one piece of data may overflow on top of another that comes next on the stack. Uncontrolled recursive calls cause stack overflows.
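
A hedged illustration of that last sentence; this program is expected to crash, since the recursion has no stopping condition and each call commits another slice of the stack:

    #include <stdio.h>

    /* WARNING: running this is expected to end in a stack overflow (often reported as a segfault). */
    static unsigned long recurse(unsigned long depth) {
        char local[1024];                /* each call reserves another chunk of the stack */
        local[0] = (char)depth;
        /* No base case: frames pile up until the reserved stack space runs out. */
        return local[0] + recurse(depth + 1);
    }

    int main(void) {
        printf("%lu\n", recurse(0));
        return 0;
    }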

Another stack

There is also the call stack, which is where the addresses are stored that execution should return to when a function call ends.

Heap

Allocation

The heap, unlike the stack, does not impose a model, a pattern of memory allocation. This is not very efficient but is quite flexible.

The heap is considered dynamic. In general you allocate and deallocate small chunks of memory, just enough for the data's needs. This allocation can physically occur in any free part of the memory available to your process.

The operating system's virtual memory management, aided by processor instructions, helps to organize this.

In a way, we can say that the stack as a whole is the first object allocated on the heap.

In practice, these actual allocations usually occur in fixed-size blocks called pages. This keeps the application from making dozens or hundreds of tiny allocations that would fragment memory in an extreme way, and it avoids calls to the operating system, which switch context and are usually much slower. In general, every memory allocation system allocates more than it needs and hands it to the application as needed; in some cases it almost simulates a stack for a while, or it reorganizes memory (through a compacting GC).
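
A minimal sketch of that "allocate more than it needs and hand it out as needed" idea, in plain C: one large block is requested up front and pieces are handed out by just advancing an offset, almost like a stack (alignment handling is omitted; nothing is freed individually; real allocators are far more elaborate):

    #include <stdio.h>
    #include <stdlib.h>

    /* Toy "bump" allocator: one big block up front, pieces handed out in sequence. */
    typedef struct {
        char  *block;
        size_t used;
        size_t capacity;
    } arena;

    static void *arena_alloc(arena *a, size_t size) {
        if (a->used + size > a->capacity)
            return NULL;                 /* reserved space exhausted */
        void *p = a->block + a->used;
        a->used += size;                 /* just move the marker forward */
        return p;
    }

    int main(void) {
        arena a = { malloc(4096), 0, 4096 };   /* a single call to the general allocator */
        if (a.block == NULL)
            return 1;

        int    *numbers = arena_alloc(&a, 10 * sizeof(int));
        double *values  = arena_alloc(&a, 4 * sizeof(double));
        numbers[0] = 1;
        values[0]  = 2.0;

        printf("used %zu of %zu bytes\n", a.used, a.capacity);
        free(a.block);                   /* everything is released at once */
        return 0;
    }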

Deallocation

Deallocation on the heap usually happens:

  • manually (at the risk of bugs, as sketched after this list), although this is not available in some languages;
  • through the so-called Garbage Collector, which identifies when a part of the heap is no longer needed;
  • when an application closes.
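
A short sketch of the "risk of bugs" in the manual case (the commented-out lines are exactly the classic mistakes):

    #include <stdlib.h>
    #include <string.h>

    int main(void) {
        char *name = malloc(32);
        if (name == NULL)
            return 1;
        strcpy(name, "heap data");

        free(name);                  /* correct: give the area back to the allocator */

        /* strcpy(name, "oops");        use-after-free: writing to an area already released */
        /* free(name);                  double free: releasing the same area twice          */

        name = NULL;                 /* common defensive habit after freeing */
        return 0;
    }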

Depends on implementation

There are even languages with specialized heaps that may behave slightly differently, but let's keep to the common cases.

Abstract concept

Clearly the heap is not a single area of memory; even conceptually it is a set of small areas of memory. Physically it is usually fragmented throughout memory. These parts are very flexible in size and lifetime.

For security reasons, it is good to know that deallocation is an abstract concept as well. It is usually possible to access an application's data even after it has finished. Content is only erased by explicit request or when a freed area is overwritten.

Cost of heap

Allocation on the heap is "expensive". Many tasks must be performed by the operating system to guarantee the proper allocation of an area for a given chunk, especially in concurrent environments, which are very common nowadays; and even when the OS is not needed, there is still a complex algorithm to allocate. Deallocating, or making an area available again, also has its cost; in some cases, to make allocation cheaper, releasing becomes very expensive (ironically, this can be managed with several stacks).

There are ways to avoid calling the operating system for each necessary allocation, but even so the "processing cost" of this is considered high. Keeping lists (in some cases linked lists) of allocated areas or pages is not trivial for the processor, at least compared to the pointer movement the stack requires.

Functioning

[image: heap allocation]

The heap is accessed through pointers. Even in languages where the concept of a pointer is not available to the programmer, this is done internally, in a transparent way.

[image: general allocation example]

Note that in the example an object of type class1 is allocated on the heap. But there is a reference to this object, which is allocated on the stack (in some cases it might not be).

This allocation is necessary because the object may be too large to fit on the stack (or would at least occupy a considerable part of it), or because it may need to survive longer than the function that created it.

If it were on the stack, the "only" way to keep it "alive" would be to copy it to the calling function, and so on to all the others where it is needed. Imagine how "expensive" that would be. The way it is organized, only the reference, which is small, needs to be copied, and that can be done using only registers, which is super fast.
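
A hedged C version of what the figure and the paragraphs above describe (class1 is just an illustrative struct name mirroring the image): the object lives on the heap and survives the function that created it; only the small reference is copied around:

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        int data[1000];              /* a "large" payload we would not want to copy around */
    } class1;

    /* The object is created on the heap, so it survives after this function returns. */
    static class1 *create_object(void) {
        class1 *obj = malloc(sizeof *obj);   /* obj, the reference, lives on the stack */
        if (obj != NULL)
            obj->data[0] = 42;
        return obj;                  /* only the pointer is copied back to the caller */
    }

    int main(void) {
        class1 *ref = create_object();       /* reference on the stack, object on the heap */
        if (ref == NULL)
            return 1;
        printf("%d\n", ref->data[0]);
        free(ref);                   /* manual deallocation (a GC would do this in managed languages) */
        return 0;
    }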

Conclusion

So the runtime of a programming language talks to the OS to manage memory. Whether this runtime is exposed to the programmer depends on the purpose of the language. In so-called "managed" languages all of this still happens, the two concepts exist and need to be understood, but you do not have to manipulate the heap manually. It ends up being as transparent as the stack is in lower-level languages (except Assembly).

Both are usually allocated in RAM, but nothing prevents them from being placed elsewhere. Virtual memory can put all or part of the stack or the heap in mass storage, for example.

"I stole" some images of this OS response that are very good to illustrate all this.

  • "In Assembly it is common to manipulate the stack almost directly" because "almost"? moves. w D1,-(sp) bsr my_function addq. l #2,sp Vc reads "stack Pointer". There is no other way.

77

The translation of "stack" is pilha, that is, a data structure in which the last element to enter is the first one to leave (think of a stack of books). The stack therefore works quite simply: elements are added/removed in an organized/restricted way, which allows processors to be optimized for the operations involved (e.g., Intel processors have dedicated registers to store the base and top-of-stack addresses). So it can be said that the stack is faster than the heap.

One concept related to the stack is the "call frame". When a function is called, pointers and parameters are recorded at the top of the stack so that the called function has access to the parameters it needs to run, and so that the program can then continue from the point where the call occurred. Again, the processor may support this directly (the call instruction of 8086 Assembly, for example).

I don't remember seeing a translation for the term "heap", but the main feature of the portion of a program's memory referred to as the "heap" is that it is intended for dynamic memory allocation (the famous alloc()/malloc()/realloc() in C, or new in C++, for example). It is precisely because the elements allocated in the heap can be allocated/deallocated at any time and in any order that access to the heap tends to be slower and memory fragmentation appears (space lost between used memory regions).
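
For example, in C (a minimal sketch of the dynamic allocation mentioned above; the sizes are arbitrary):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void) {
        size_t n = 4;
        int *values = malloc(n * sizeof *values);   /* dynamic allocation: size chosen at run time */
        if (values == NULL)
            return 1;

        n = 1024;
        int *bigger = realloc(values, n * sizeof *bigger);  /* the block can grow later */
        if (bigger == NULL) {
            free(values);
            return 1;
        }
        bigger[n - 1] = 7;
        printf("%d\n", bigger[n - 1]);

        free(bigger);                /* deallocation can happen at any time, in any order */
        return 0;
    }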

As I mentioned earlier, Intel processors support stack control directly. For other processors, the operating system may need to take care of this. In any case, when the program/thread is initialized, it is up to the operating system to reserve a portion of memory for it. In many cases this area is "shared" between the stack and the heap: one of them starts from the lowest address and the other from the highest, and the two grow toward each other. When one meets the other, an out-of-memory exception (or something like that) occurs.

In Assembly/C, the programmer must be aware of these differences and is directly involved in choosing where to allocate each variable. In more "modern" languages, like Java, which has conveniences such as Garbage Collection, the concepts still apply, but it is easier for a programmer to get by without knowing the details of what happens under the hood.

  • 3

    +1, good answer. I would just point out that a stack is reserved for each thread, while the heap is shared by all threads within the same process.

46

I would like to present here an understanding that is less technical than the answers above, but that may help the programmer who just wants to know what this is about without going deep into the subject.

Stack memory is used to store the arguments of a function, procedure, or method. It is statically pre-allocated at the start of the program and released at the end, so it is faster than heap memory, which needs to be allocated/released every time it is needed.

Stack memory is usually allocated as a single large block, and each function/method/procedure argument is assigned a part or piece of that memory. The fact that stack memory has a fixed size, specified in the project, makes it a potential source of problems if a smaller size is specified than the program ends up needing.

So in the past it was common (less so nowadays) to see the famous error message that gives this site its name, "Stack Overflow", meaning the stack memory has run out. This happens, or used to happen a lot, with cascading function calls (a function that calls another, which calls another, which...) and also with recursive calls of the same function/procedure/method where there is no termination control, or where terminating would use so much stack memory that it would be necessary to increase its size and recompile the program.

Nowadays the famous "stack overflow" is rare, because today's compilers reserve by default a very large amount of stack in the project. That was not the case in the past, when computers had much less memory than today's and it was not possible to reserve a lot of stack.

Heap memory is widely used for allocating objects and deallocating them at the end of their use. Memory, in a generic sense, can be static or dynamic: static memory (also called global) uses the data segment (on Intel 80x86 processors), while dynamic memory uses the heap, and that is what differentiates one from the other.

But what's the difference between heap and stack? While the stack is used by functions/methods/procedures, the heap can be used at any point in the program, normally when creating objects or pointers to some data structure.

  • 1

    It is an explanation, it is not wrong, but gives "misleading" indications of how these concepts work.

  • 3

    I understand, but as I said at the beginning, I did not intend to go into it very technically, since there were other answers that were more complete on the technical side. I tried to focus more on usage and on how it works on the 80x86 processor.

  • There are languages that use the processor registers for parameters and other languages that use the "stack". Stack and heap are pieces of memory. In languages like Pascal, C, etc. you have limitations. In ASM you do not, and you can use memory the way you want.

41

Definitions

Stack

On the stack, objects are allocated within function scopes: local variables of functions, arguments, the addresses of the code being executed before other function calls, and function return values.

Memory allocation occurs sequentially and, since the position of these objects is known at compile time, we can give them proper names and access them directly. When an object allocated on the stack leaves its scope, it is automatically destroyed. So you don't have to worry about allocation and deallocation for stack objects, but watch out: the stack has a limited size.

Heap

The heap is the appropriate memory location for allocating many large objects, since this section of the program is much larger than the stack and its size is limited only by the virtual memory available on your machine. The objects allocated on the heap are all those allocated using new or malloc() (dynamically allocated objects). Since the position these objects will occupy during program execution is unknown at compile time, the only way to access them is through pointers. You must remember to control the deallocation of these objects, as they are not destroyed automatically.
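
A compact C contrast of the two definitions above (a sketch; the sizes are arbitrary):

    #include <stdlib.h>

    static void example(void) {
        int local[10];                           /* stack: released automatically when the scope ends */
        local[0] = 1;

        int *dynamic = malloc(10 * sizeof *dynamic);   /* heap: reachable only through the pointer */
        if (dynamic != NULL) {
            dynamic[0] = 1;
            free(dynamic);                       /* heap: must be released explicitly */
        }
    }                                            /* here "local" simply ceases to exist */

    int main(void) {
        example();
        return 0;
    }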

Answers

Is one of them faster than the other? If one is clearly faster, why use the other at all?

The stack (pilha) is faster because the variables/objects are created at compile time and the stack does not extend into the machine's virtual memory (disk), whereas at some point an object/variable allocated on the heap may be stored on the hard drive and then have to be loaded from the hard drive back into RAM.

Does it matter if I'm using, for example, Assembly, C, Java, or JavaScript, Windows or Linux? Is this controlled by the language or by the operating system?

Languages such as Java and Python have a Garbage Collector that removes from memory objects and variables that are no longer referenced. As for the difference between operating systems, there may be one regarding addressing; I think this question is better explained by this link to Stack Overflow in English.

Reference: http://www.unidev.com.br/index.php?/topic/55299-understanding them-Divis%C3%B5es-de-mem%C3%B3ria-stack-heap-global-e-code/

  • 2

    Just one detail: no object's position is known at compile time.
