What are error-correcting codes?

Question

What are error-correcting codes?

Asked 10 years, 1 month ago

Viewed 778 times

21

What they are and how they work error-correcting codes (error-correcting code)? I’ve seen this concept for example in QR Code (where partially damaged and/or imperfectly captured code is still readable) and I have also heard that it is used in space probes, when transmitting data over long distances with enough interference in the signal. I would like to know more about the concept, and how/where to apply it:

(Clarification: I refer only to codes, not to complete protocols. Like QR Code, once printed it cannot be "resubmitted", so any correction capability needs to be present in the code itself.)

What are the principles behind the technique? If it is something very extensive, I am satisfied with just an introduction followed by some references.
What are the limitations? Returning to the QR Code example, I saw that it is possible to configure it to tolerate up to 30% error. Is there any point from which codes become excessively long (tending to infinity, of course[1]), what is the best rate that can be expected in practice? And do these codes correct errors at arbitrary positions in the data (e.g., beginning, middle, end, 1 bit every 8, any combination of the above, etc.) or just some specific error types? It is possible to code "fix" to something other than the original data?
In what scenarios is this used in practice? There are libraries ready to do this, or you need to implement by hand, case by case (depending on the particularities of each application)?
- Clarifying: I am not asking for a "list of cases", just know in general lines if it is worth using this kind of thing in day-to-day, or only in exceptional circumstances (I note that in practice if it is communication error, relays, if it is storage error, it is used backups; I don’t see this kind of thing being discussed much, not even mentioned).

Note: I am already familiar with codes detectors of errors (such as parity, checksums, hashes, Macs, etc.), I am interested in codes that allow to recover the original data even in the presence of some degree of error in them. Of course, to correct an error first you have to detect it, but my common sense says that the traditional mechanisms mentioned above are not enough - after all, the tag check can also be transmitted with errors...

[1]: When I say "tending to infinity" I mean the fact that it is impossible to correct 100% wrong code unless it is infinite. My question is whether by increasing the level of error tolerance, whether the code size grows at a high rate (e.g., quadratic, exponential...) or at a lower rate (e.g., linear, logarithmic) - allowing for example to achieve tolerances of 90% or more.

You’ve already managed to clear up your doubt?

– gato

2016/06/10 at 20:23
@drmcarvalho It’s been a while since I asked this question, and I’ve forgotten the subject hehe! I have actually had some contact with the theoretical part (although I still have some difficulty understanding it), but I am completely lost with the practical part. I will give a "Bump" in the question, suddenly someone who knows more of the subject manifests...

– mgibsonbr

2016/06/10 at 21:44
I wrote my own QR encoder, but it made me want to cry when I read the documentation. For the polynomial part I was cutting way in everything that is given place (using pre-calculated tables, and simplifying in some cases with array to index the powers). I knew they had complicated in generation to simplify reading, but exaggerated :) - I hope someone will address the subject in the answers, now that has the Bounty, because despite being a boring thing to implement, it is very rich in details.

– Bacco

2016/06/10 at 21:56
2

On the last question, in general these techniques are used when the communication channel is noisy (radio waves, bar code images, etc.). For non-noisy channels (such as a network connection) it usually is OK to do a simple checksum and resend the data if you have a problem. BTW to wkipédia page about Forward Error Correction has more details than the one linked in the question.

– hugomg

2016/06/10 at 22:04
Melgibsonbr, kkk, the Computer Network of Tanenbaum, addresses this subject in cap 3, if I understand your question correctly... It’s very interesting, but I’m not in a position to explain... By the way, I’m gonna get a call, see if anybody’s qualified, 'cause it’s more binary math than high-level code... You can get an idea-https://en.wikipedia.org/wiki/Cyclic_redundancy_check

– MagicHat

2016/06/10 at 22:06
@Magichat CRC is much simpler, it’s more for data conferencing. Error-Correction in general has redundancy of information, because the goal is for you to be able to recover 100% of the original information without retransmission, even losing part of the data.

– Bacco

2016/06/10 at 22:09
@Bacco Crcs are based on the Theory of Cyclic error-correcting codes, this put, I imagine that to delve into something so complex, it is necessary a base... But in the book I mentioned, other terms too... Including data retrieval ... It’s worth taking a, "dig"...

– MagicHat

2016/06/10 at 22:14
@mgibsonbr is a good question, but do not know the subject to give an answer, but I will wait for someone to answer, this will definitely help the community a lot.

– gato

2016/06/10 at 22:18
@mgibsonbr I’ll try, do a 10 ai.... Necessary soundtrack... https://www.youtube.com/watch?v=ZSAsI5pnAbc

– MagicHat

2016/06/10 at 22:30
Here’s a really cool question. : ) Looking forward to seeing it unfold. Meanwhile, in the courtroom: https://www.youtube.com/watch?v=KA8hDldvv0

– Luiz Vieira

2016/06/10 at 22:39
Peri I’m going to need a little more time.... https://www.youtube.com/watch?v=-jyTinxcKDg

– MagicHat

2016/06/10 at 23:47
@mgibsonbr I hope that you or any other user did not interpret the "kkk" of my first comment as being sarcastic to your question... I’m actually laughing at your nick Mel Gibson BR ! x)

– MagicHat

2016/06/11 at 07:36
@Magichat It’s funny that said whose is probably the reason of my nick - but not the way it looks! When I created my email on yahoo.com.br (yes, I am old rsrs) I wanted "Gibson", as my friends call me, but it was already used, so I put "mgibson". When I switched pro gmail.com, this was already used, so without thinking too much (after all when this "Google" will surpass the almighty Yahoo?) I passed the "br" I didn’t have in the domain pro username. Now that everyone knows me like this, I’ve been "cursed" with this bizarre nickname for the rest of my life! : P

– mgibsonbr

2016/06/11 at 17:54
@mgibsonbr heheh...is part, but the said whose is legal, demonstrates acumen, agility, do not feel cursed..... kkkk

– MagicHat

2016/06/11 at 18:19

Show 9 more comments

1 answer

Browser other questions tagged mathematics serialization error-recovery content-coding

You are not signed in. Login or sign up in order to post.

by MagicHat • **12,262** points · Answer 1 · 2016-06-11T07:31:16+00:00

_(ツ)_/

Error detection and correction (Error Detection and Correction)

What are and how error-correcting codes work ?

Definition

They are techniques that allow the delivery of data more reliably through communication channels.

Introducing

Detection and Correction, are terms that walk together being impossible the application of the second without the effectiveness of the first.

Communication channels at the physical level generally are susceptible to sensitive failures, often causing data loss or fragmentation.

"Functioning, what are the principles behind the technique, it is possible to code "correct" for something other than the original data ?"

There are 2 best known methods for error correction:

Repeat request automatic (ARQ - Automatic repeat request):

Uses "acknowledgements" (confirmations via messages sent by the receiver indicating that it has correctly received a data frame or package ) and "timesout" (time periods preceding a confirmation of receipt) to perform a reliable data transmission. If the sender does not receive a confirmation from the receiver before the "timeout" completes its cycle, the packet is retransmitted until the source receives a confirmation or exceeds a predefined number of attempts.

acknowledgements

timeout

There are 3 main protocols using the ARQ method.

All are based on Sliding window Protocol.

2- The second method is Forward error Correction (Advanced error):

The sender redundantly encodes the code using an error correction code (ECC), through the redundancy added to the code the receiver recovers the original data in general the reconstructed data are considered "most likely" original data. The basis for this method is the work of Richar Hamming, author of hamming code and Hamming code(7,4).

The 2 main methods (ARQ and FEC) when combined fix small errors and avoid retransmission and major errors are fixed through 1 relay request : this process is called (HARQ) Hybrid Automatic repeat request.

The main principles that form part of the basis for implementation are:

the Reed-Solomon code set
BCH code
MDPC
Euclid’s algorithm and its extent Euclid’s Algorithm Extended
among others.

The explanation of the logic used goes beyond the scope of this answer, but checking the links, or doing an in-depth research on the above appointments can find the full clarification of the functioning.

"What limitations, what is the best rate that can be expected in practice?"

Nowadays, due to the progress in the production of physical media, the tolerance in some media is close to 0 (optical fiber), and can be conquered in some special laboratories. However the diversity of means, makes us consider for example the radio transmissions that depend on a controlled environment, which is not possible on the day, because rains, winds, topology and other natural details or not, interfere directly in communication. For frames encoded by CCSDS Reed-Solomon code, less than 1 in 40,000 wrong frames can escape detection.

"Is there any point from which the codes become excessively
long (tending to infinity, of course)?"

In theory, the algorithmic theory claims to be a sequence finite of instructions. The algorithms of detection and correction, are relatively short, because they act on parts of defined size (packages). However, this assertion is applied to simple commercial means, and taking into account military cryptography, space probes and other applications, I believe that they can really tend towards "infinity", or a few million lines... Although the use of logic serves exactly to inhibit this idea, I think...

"And these codes correct errors at arbitrary positions in the data (e.g.: beginning, middle, end, 1 bit every 8, any combination of previous, etc.) or only some specific error types?"

They act on predetermined parts of the packets, that is, the arbitrariness in the case is not the beginning, middle or end, but the error detected in itself.

The Advisory Committee for Data Space Systems ( CCSDS ) defines the protocol used for the transmission of data from spacecraft instruments through the deep space channel. According to this pattern, an image or other data sent from a spacecraft instrument is transmitted using one or more packets, ranging from 7 to 65,542 bytes, including the package header. These in turn are transmitted through tables of up to 2048 bytes.

"In which scenarios this is used in practice?"
- Internet
- Telecommunications in deep space
- Satellite broadcasting
- Data storage
- Memory with error correction
"There are libraries ready to do this, or we need to implement on a case-by-case basis (depending on the particularities of each application)?"

The set of codes and protocols used to construct progress directed to these tasks is called Elementary Data Link Protocols. And the protocol code is in turn a set of classes.

Some definitions used in the presented protocols are stored in the file Protocol. h

#define MAX_PKT 1024  // Packet size in bytes
typedef enum {false,true} boolean;
typedef unsigned int seq_nr // seq or ack numbers
typedef struct {unsigned char data[MAX_PKT];} packet
typedef enum {data, ack,nak} frame_kind

typedef struct{
  frame_kind kind;
  seq_nr seq;
  seq_nr ack;
  packet info;
} frame;

void waif_for_event(event_type *event);
void from_network_layer(packet *p);
void to_network_layer(packet *p);
void from_physical_layer(frame *r);
void to_physical_layer(frame *s);
void start_timer(seq_nr k);
void stop_timer(seq_nr k);
void start_ack_timer(void);
void stop_ack_timer(void);
void enable_network_layer(void);
void disalbe_network_layer(void);
insira o código aqui
#define inc(k) if(k<MAX_SEQ) k++; else k=0

Examples of protocols and codes

Simplex protocol for channel with noise
Sliding window protocol using go back n
Selective repeat protocol

Pipelining and error recovery. Effect of an error when (a) the size of the receiving window is 1 and when (b) the size of the receiving window is large

Note on clarification:

In practice these protocols are used embedded in the hardware responsible for the transmission and reception of the data, and are abstracted from layers closer to the end user, so it goes unnoticed.

Special thanks to Tanenbaum