What is Base64 encoding for?

Viewed 9,801 times

14

Virtually every self-respecting programming language has its own implementation of Base64 encoding and decoding, converting a string to and from a string of Base64 characters.

But what is Base64 itself for?

Thank you!

3 answers

18


Base64 is a method for encoding data for transfer over the Internet (it is one of the MIME content transfer encodings). It is often used to transmit binary data over channels that only handle text, for example to send attachments by email.

It uses 64 characters ([A-Za-z0-9], "/" and "+"), which is where the name comes from. The character "=" is used as a special padding suffix, and the original specification (RFC 989) defined that the symbol "*" can be used to delimit encoded but unencrypted data within a stream.

Encoding example:

Original text: hello world
Text converted to Base64: aGVsbG8gd29ybGQ=
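To make the example concrete, here is a minimal sketch using Python's standard `base64` module. It assumes the text is first turned into bytes as UTF-8, since Base64 operates on bytes rather than on characters. It also shows that a trailing newline (such as the one a shell `echo` appends) changes the final group of the output.

```python
import base64

text = "hello world"

# Base64 works on bytes, so the text is encoded as UTF-8 first (an assumption
# of this sketch; any byte encoding would do).
encoded = base64.b64encode(text.encode("utf-8"))
print(encoded.decode("ascii"))   # aGVsbG8gd29ybGQ=

# Decoding reverses the process and gives back the original bytes.
decoded = base64.b64decode(encoded)
print(decoded.decode("utf-8"))   # hello world

# The same text with a trailing newline produces a different last group.
with_newline = base64.b64encode(b"hello world\n").decode("ascii")
print(with_newline)              # aGVsbG8gd29ybGQK
```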

Base64 encoding is often used when binary data has to be transferred or stored through a medium designed to work with textual data. It is widely used by applications together with XML, since it makes it possible to store binary data in text form.

Source

13

Sometimes you want to transfer binary data but cannot, because some media are designed to carry only text.

As an example, suppose you have the following data:

nome = "Joao"
idade = 20

You can transfer this data in a textual form, such as JSON:

{"nome":"Joao", "idade":20}

With binary data, you can't simply take the raw values and write them out as text. This is where Base64 comes in.

To get around this, people encode their binary data in Base64, which gives them a textual representation they can use for any kind of transfer or storage.

There are several other encodings that can be used, but the most common is Base64.
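As an illustration, the sketch below embeds a few arbitrary bytes in a JSON document by encoding them in Base64 first. The `foto` field and its byte values are hypothetical, made up for this example; only `nome` and `idade` come from the data above.

```python
import base64
import json

# Hypothetical binary payload: arbitrary bytes that are not valid text.
photo_bytes = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])

record = {
    "nome": "Joao",
    "idade": 20,
    # JSON has no bytes type, so the payload travels as a Base64 string.
    "foto": base64.b64encode(photo_bytes).decode("ascii"),
}
payload = json.dumps(record)
print(payload)

# On the receiving side, the string is decoded back into the original bytes.
received = json.loads(payload)
received_bytes = base64.b64decode(received["foto"])
print(received_bytes == photo_bytes)  # True
```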

11

The US-ASCII character set has 95 "printable" characters plus 33 control characters (0 to 31 and 127, or 00-1F and 7F in hexadecimal), originally used to control devices such as printers. Since it is the most "universal" encoding in existence (virtually all others, including Unicode and the Windows "code pages", are supersets of it), ASCII text sent from an origin will probably be accepted at any destination (and by any intermediary) without data corruption. When data (text or binary) cannot be expressed in ASCII as-is, it is sometimes desirable to encode it as ASCII text before sending it and decode it again when it arrives at its destination.

The highest power of 2 below 95 is 64 (2^6), so each output character can carry exactly 6 bits. In principle one could try to encode data in base 95 itself, but this is complicated and often inefficient. The advantage of base 64 is that every 3 bytes of input (3*8 = 24 bits) result in exactly 4 characters of output (4*6 = 24 bits), so you can convert to and from binary with constant memory use and quite simple operations.
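The 3-bytes-to-4-characters step can be sketched by hand as follows. This is an illustrative sketch, not a replacement for a library: the function names are made up for this example, it assumes the traditional alphabet ending in `+` and `/`, and it deliberately handles only inputs whose length is a multiple of 3, so no padding is involved.

```python
import base64

# Traditional Base64 alphabet: index 0..63 maps to one character.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(group: bytes) -> str:
    """Encode exactly 3 bytes (24 bits) as 4 characters (4 x 6 bits)."""
    if len(group) != 3:
        raise ValueError("this sketch only handles full 3-byte groups")
    # Pack the three bytes into one 24-bit integer.
    n = (group[0] << 16) | (group[1] << 8) | group[2]
    # Slice it into four 6-bit indices, most significant first.
    return "".join(ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0))


def encode_full_groups(data: bytes) -> str:
    """Encode data whose length is a multiple of 3, one group at a time."""
    if len(data) % 3 != 0:
        raise ValueError("length must be a multiple of 3 in this sketch")
    return "".join(encode_group(data[i:i + 3]) for i in range(0, len(data), 3))


print(encode_group(b"hel"))  # aGVs

# Cross-check against the standard library on a 12-byte input.
sample = b"hello world!"
print(encode_full_groups(sample) == base64.b64encode(sample).decode("ascii"))  # True
```

Because each group is processed independently, the encoder only ever needs to hold 3 input bytes at a time, which is the constant memory use mentioned above.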

Base64 encoding uses all uppercase and lowercase letters and all digits, 62 characters in total, plus two others chosen case by case, traditionally + and /. The = is also widely used as padding to mark the end of the data when its size is not a multiple of 3, so it is common to see one (=) or two (==) of these symbols at the end of Base64 strings. A less common alternative, for when you don't want to use anything other than letters and digits or to distinguish between upper and lower case, is base 32, which uses all 26 letters plus 6 digits (2 to 7 in the standard RFC 4648 alphabet).
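The padding and the alphabet variants can be observed with Python's standard `base64` module, as in the short sketch below. The sample inputs are arbitrary. Note that `urlsafe_b64encode` swaps `+` and `/` for `-` and `_`, and that `b32encode` follows the RFC 4648 Base32 alphabet (A-Z and 2-7).

```python
import base64

# Padding depends on the input length modulo 3.
padded = [base64.b64encode(data).decode("ascii") for data in (b"a", b"ab", b"abc")]
print(padded)  # ['YQ==', 'YWI=', 'YWJj']

# The last two characters of the alphabet vary between variants.
raw = b"\xfb\xff"
standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")
print(standard)  # +/8=
print(url_safe)  # -_8=

# Base32 uses only uppercase letters and a few digits.
b32 = base64.b32encode(b"hello").decode("ascii")
print(b32)  # NBSWY3DP
```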
