Tool/Script to encode and decode base16 (Hex) data

RFC 4648 (The Base16, Base32, and Base64 Data Encodings) defines several methods to encode binary data. Almost every Unix-like system ships the base64 tool to encode and decode data using the base64 alphabet. This alphabet consists of the characters A-Z, a-z, 0-9, + and / for the data, plus = for padding. The base16 encoding scheme, better known as hex encoding, uses the alphabet 0-9 and A-F and is case-insensitive. At the time of writing, GNU coreutils did not include a base16 tool, and I searched without success for a hex encoding and decoding tool with the same functionality as base64. That’s why I wrote a script to hex encode and decode binary data. Basically, it’s a wrapper around some Perl code.

Theory

RFC 4648 (The Base16, Base32, and Base64 Data Encodings) tells us that 4 bits can be represented by one character of the alphabet 0-9 and A-F. This is because 2^4 is 16 and the alphabet has 16 different characters. A byte (8 bits) therefore takes two characters of this alphabet.
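To make the arithmetic concrete, here is a minimal shell sketch that splits one byte into its two 4-bit groups and looks each one up in the alphabet. The function name byte_to_hex is made up for this illustration and is not part of the base16 script.

```shell
# Split one byte into its two 4-bit groups and map each to a base16 character.
# byte_to_hex is a hypothetical helper, not part of the base16 script.
byte_to_hex() {
    byte=$1                       # decimal value 0..255
    alphabet=0123456789ABCDEF     # the 16-character base16 alphabet
    high=$(( byte >> 4 ))         # upper 4 bits
    low=$(( byte & 15 ))          # lower 4 bits
    # cut counts characters from 1, so add 1 to each index
    printf '%s%s\n' \
        "$(printf '%s' "$alphabet" | cut -c "$(( high + 1 ))")" \
        "$(printf '%s' "$alphabet" | cut -c "$(( low + 1 ))")"
}

byte_to_hex 72    # prints 48
```

The letter H has the value 72 (binary 0100 1000), so the upper group is 4 and the lower group is 8, which gives 48.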

This is an extract of the RFC:

[...]

8. Base 16 Encoding

The following description is original but analogous to previous
descriptions. Essentially, Base 16 encoding is the standard case-
insensitive hex encoding and may be referred to as “base16” or “hex”.

A 16-character subset of US-ASCII is used, enabling 4 bits to be
represented per printable character.

The encoding process represents 8-bit groups (octets) of input bits
as output strings of 2 encoded characters. Proceeding from left to
right, an 8-bit input is taken from the input data. These 8 bits are
then treated as 2 concatenated 4-bit groups, each of which is
translated into a single character in the base 16 alphabet.

Each 4-bit group is used as an index into an array of 16 printable
characters. The character referenced by the index is placed in the
output string.

[...]
Table 5: The Base 16 Alphabet

Value Encoding  Value Encoding  Value Encoding  Value Encoding
    0 0             4 4             8 8            12 C
    1 1             5 5             9 9            13 D
    2 2             6 6            10 A            14 E
    3 3             7 7            11 B            15 F

Unlike base 32 and base 64, no special padding is necessary since a
full code word is always available.

[...]
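The encoding process from the extract can be sketched in a few lines of shell. This is only an illustration, assuming that od and tr are available; hex_encode is a hypothetical name, and this is not how the base16 script itself works (that one wraps Perl code).

```shell
# Encode standard input as uppercase base16, one octet at a time.
# hex_encode is a hypothetical helper, not part of the base16 script.
hex_encode() {
    # od prints every input octet as an unsigned decimal number;
    # -v keeps repeated lines instead of collapsing them to "*".
    # tr then puts each number on its own line.
    od -An -v -t u1 | tr -s ' ' '\n' | while read -r octet; do
        if [ -n "$octet" ]; then
            # upper 4 bits first, then lower 4 bits, as described in the RFC
            printf '%X%X' "$(( octet >> 4 ))" "$(( octet & 15 ))"
        fi
    done
    echo
}

printf 'Hi' | hex_encode    # prints 4869
```

Every input octet yields exactly two output characters, which is why base16 never needs padding.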

Installation

The script base16 can be found on GitHub: https://github.com/emanuelduss/Scripts/blob/master/base16.

You can do a quick and dirty install using the following commands:

$ curl -L https://raw.githubusercontent.com/mindfuckup/Scripts/master/base16 | sudo tee /usr/local/bin/base16 >/dev/null
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Dload  Upload   Total   Spent    Left  Speed
100  2574  100  2574    0     0  11817      0 --:--:-- --:--:-- --:--:-- 11861
$ sudo chmod 755 /usr/local/bin/base16

On Arch Linux, you can install base16 from the Arch User Repository (AUR):

$ packer --noedit --noconfirm -S base16
[...]

Usage

The script base16 offers the same options and functionality as the base64 tool. The usage can be displayed with the -h option:

$ base16 -h
Usage: base16 [OPTION]... [FILE]
Base16 (Hex) encode or decode FILE, or standard input, to standard output.
With no FILE, or when FILE is -, read standard input.

Options:
-d, decode data
-i, when decoding, ignore non-alphabet characters
-w COLS wrap encoded lines after COLS character (default 76).
Use 0 to disable line wrapping
-h display this help and exit

The data are encoded as described for the base16 alphabet in RFC 4648.
When decoding, the input may contain newlines in addition to the bytes of
the formal base64 alphabet.  Use --ignore-garbage to attempt to recover
from any other non-alphabet bytes in the encoded stream.
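The help text says that newlines are always accepted when decoding and that -i skips any other non-alphabet characters. A rough shell sketch of that behaviour, not taken from the script, could look like this. The function hex_decode and its ignore argument are hypothetical names, and the conversion assumes that printf accepts a 0x prefix for numeric arguments, as bash and dash do.

```shell
# Decode base16 from standard input.
# hex_decode and its "ignore" argument are hypothetical, not the script's API.
hex_decode() {
    if [ "${1:-}" = ignore ]; then
        # keep only characters of the base16 alphabet (both cases)
        digits=$(tr -cd '0-9A-Fa-f')
    else
        # newlines are always tolerated
        digits=$(tr -d '\n')
    fi
    case $digits in
        *[!0-9A-Fa-f]*)
            echo "hex_decode: invalid character in input" >&2
            return 1 ;;
    esac
    if [ $(( ${#digits} % 2 )) -ne 0 ]; then
        echo "hex_decode: odd number of hex digits" >&2
        return 1
    fi
    while [ -n "$digits" ]; do
        rest=${digits#??}          # everything after the first two characters
        pair=${digits%"$rest"}     # the first two characters
        digits=$rest
        # turn the hex pair into an octal escape that printf %b emits as one byte
        printf '%b' "\\0$(printf '%03o' "0x$pair")"
    done
}

printf '48656c6c6f\n' | hex_decode    # prints Hello (without a trailing newline)
```

Without the ignore argument, anything other than hex digits and newlines is rejected, which mirrors the difference between plain decoding and decoding with -i.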

Examples

For example, you can encode some data. When no file is provided, the data is read from standard input:

$ echo -en "Hello World\nThis is a test with a newline" | base16 | tee /tmp/test.out
48656c6c6f20576f726c640a546869732069732061207465737420776974682061206e65776c
696e65

If you don’t like the line breaks, use the option -w 0 to disable them:

$ echo -en "Hello World\nThis is a test with a newline" | base16  -w 0
48656c6c6f20576f726c640a546869732069732061207465737420776974682061206e65776c696e65

This can then be decoded again using base16:

$ base16 -d /tmp/test.out
Hello World
This is a test with a newline
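The line wrapping seen in the examples above is independent of the encoding itself. As a sketch, the same effect can be had with the standard fold utility. wrap_hex is a hypothetical helper with 76 columns as the default and 0 for no wrapping, mirroring the -w option; it is not taken from the script.

```shell
# Wrap a long line of hex digits after a fixed number of columns.
# wrap_hex is a hypothetical helper, not part of the base16 script.
wrap_hex() {
    cols=${1:-76}           # default width, as in the help text
    if [ "$cols" -eq 0 ]; then
        cat                 # 0 means: no wrapping
    else
        fold -w "$cols"
    fi
}

printf '0123456789\n' | wrap_hex 4
# prints:
# 0123
# 4567
# 89
```

The 41 bytes of the test input above become 82 hex characters, so with the default width the output is one line of 76 characters followed by one line of 6.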
