What do the terms Memory Poisoning and HWPOISON mean?

I’m studying parts of the GNU/Linux kernel and came across the terms above in the Memory Management / Memory Allocation area. I would like to know what each of them means.

1 answer

Let’s take it piece by piece :)

Why do we have them?

Back in 2009, Intel added a new mechanism to its processors called Machine Check Architecture Recovery (commonly called MCA Recovery).

This new mechanism was intended to make it possible for the processor to report hardware failures to the operating system.

The operating system, in turn, may act (or not) on the error reported by the processor, instead of ending in a kernel panic.

Where does HWPOISON come in?

HWPOISON is a Linux error handler, distributed as a patch, that handles the errors reported by the processor and treats them accordingly. These errors are received through a Machine Check Exception.

The exception is handled by the operating system and, if the value found in the registers is related to the MCA and is recoverable, it is passed on to HWPOISON.
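
To make that decision a bit more concrete, here is a minimal sketch in Python (not kernel code) of how the status value of one MCA bank could be classified. The bit positions follow the IA32_MCi_STATUS layout in the Intel manual, and the sketch assumes a processor that supports software error recovery, since the S and AR bits only have meaning there. The function name `classify_mci_status` and the labels it returns are made up for illustration; the real kernel code checks more conditions than this.

```python
# Bit positions in IA32_MCi_STATUS, as laid out in the Intel manual.
VAL = 1 << 63  # register holds a valid error
UC = 1 << 61   # error was not corrected by the hardware
PCC = 1 << 57  # processor context corrupted
S = 1 << 56    # signaled: a machine check exception was raised
AR = 1 << 55   # action required before execution can continue


def classify_mci_status(status: int) -> str:
    """Return a hypothetical label for one MCA bank status value."""
    if not status & VAL:
        return "no-error"
    if not status & UC:
        return "corrected"
    if status & PCC:
        return "fatal"            # context is lost: nothing to recover
    if status & S and status & AR:
        return "action-required"  # recoverable, must be handled now
    if status & S:
        return "action-optional"  # recoverable, can be handled later
    return "no-action"            # uncorrected, but no exception raised
```

In this sketch, only the "action-required" and "action-optional" cases are the recoverable ones that would be handed over to HWPOISON.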

A summary taken from the article:

HWPOISON is a poisoned data handler invoked by the low-level Linux machine check code. Where possible, HWPOISON attempts to gracefully recover from memory errors, and contain faulty hardware to prevent future errors.

Memory Poisoning is still missing

So let’s think about this: we have several processes running in the system, each managing its own objects in memory.

When the CPU reports a memory-related error to the OS, the OS must somehow understand what went wrong, notify the processes that are currently using the affected memory region, and set that region aside so that it is no longer used.

This is called Memory Poisoning: the OS marks the memory page as "poisoned", finds all the processes related to that page, and handles the memory accordingly. Where possible the processes are not killed; their mappings to the poisoned page are removed, and the page itself is taken out of use.

In the case of Linux this is the work of HWPOISON :)

To recover from poisoned, user-mapped pages, HWPOISON first finds all user processes which mapped the corrupted page. For clean pages with backing store, HWPOISON need not take recovery action since the process does not need to be killed. Dirty pages are unmapped from all associated processes, which are subsequently killed.
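
The policy in that quote can be illustrated with a small toy model in Python. This is only a sketch of the idea, not how the kernel implements it: the `Page` class, its fields, and `handle_poisoned_page` are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    dirty: bool               # modified since it was loaded?
    has_backing_store: bool   # can the data be re-read from disk?
    mapped_by: set = field(default_factory=set)  # PIDs mapping the page
    poisoned: bool = False


def handle_poisoned_page(page: Page) -> set:
    """Toy policy: poison the page, unmap it, return the PIDs to kill."""
    page.poisoned = True          # the page must never be reused
    owners = set(page.mapped_by)
    page.mapped_by.clear()        # remove every process mapping
    if not page.dirty and page.has_backing_store:
        return set()              # data can be reloaded: nobody is killed
    return owners                 # the only copy of the data is lost
```

A clean, file-backed page mapped by two processes returns an empty set (nobody is killed), while a dirty page returns the PIDs of every process that had it mapped.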

The HWPOISON source code can be seen here

References

Machine Check Architecture Recovery

Machine Check Exception

HWPOISON Discussion

HWPOISON Article

HWPOISON Repository

Memory Page

Intel Architecture Manual

  • Section 16.11.1.1 Processor Machine Check Status Register MCA Error Code Definition shows how errors are treated internally in the processor
  • Very well answered, thank you for pointing me in the right direction. I will read the references in more depth; I swear I never imagined these terms meant this :)
