Introduction

RFC 4648 (The Base16, Base32, and Base64 Data Encodings) defines several ways to encode binary data as text. Most Unix-like systems ship with the base64 tool, which encodes and decodes data using the base64 alphabet: the characters A-Z, a-z, 0-9, + and / for the data, and = for padding. The base16 encoding scheme, better known as hex encoding, uses the alphabet 0-9 and A-F and is case-insensitive. The GNU coreutils do not include a tool named base16, and I searched without success for a hex encoding and decoding tool with the same functionality as base64. That’s why I wrote a script to hex encode and decode binary data. Basically, it’s a wrapper around some Perl code.

Base16 Theory

RFC 4648 (The Base16, Base32, and Base64 Data Encodings) tells us that 4 bits can be represented by one character of the alphabet 0-9 and A-F. This works because 2^4 is 16 and the alphabet has 16 different characters. A byte (8 bits) is therefore encoded as two characters of this alphabet. For example, the letter H (byte value 72, binary 0100 1000) is encoded as 48.

This is an extract of the RFC:

[...]
8.  Base 16 Encoding

   The following description is original but analogous to previous
   descriptions.  Essentially, Base 16 encoding is the standard case-
   insensitive hex encoding and may be referred to as "base16" or "hex".

   A 16-character subset of US-ASCII is used, enabling 4 bits to be
   represented per printable character.

   The encoding process represents 8-bit groups (octets) of input bits
   as output strings of 2 encoded characters.  Proceeding from left to
   right, an 8-bit input is taken from the input data.  These 8 bits are
   then treated as 2 concatenated 4-bit groups, each of which is
   translated into a single character in the base 16 alphabet.

                     Table 5: The Base 16 Alphabet

         Value Encoding  Value Encoding  Value Encoding  Value Encoding
             0 0             4 4             8 8            12 C
             1 1             5 5             9 9            13 D
             2 2             6 6            10 A            14 E
             3 3             7 7            11 B            15 F

   Unlike base 32 and base 64, no special padding is necessary since a
   full code word is always available.
[...]
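The encoding step described in the RFC can be sketched in a few lines of POSIX shell. This is only an illustration and not the Perl code used by the base16 script; the function names hex_digit and hex_encode are made up for this example. Each byte is split into its upper and lower 4 bits, and each 4-bit value is used as an index into the Base 16 alphabet.

```shell
#!/bin/sh
# Sketch of RFC 4648 Base 16 encoding; hex_digit and hex_encode are
# hypothetical helper names, not taken from the base16 script.

# Map a 4-bit value (0..15) to its character in the Base 16 alphabet.
hex_digit() {
    case $1 in
        10) printf 'A' ;;
        11) printf 'B' ;;
        12) printf 'C' ;;
        13) printf 'D' ;;
        14) printf 'E' ;;
        15) printf 'F' ;;
        *)  printf '%s' "$1" ;;   # 0..9 map to themselves
    esac
}

# Read bytes from standard input and print two hex characters per byte.
hex_encode() {
    # od prints every input byte as an unsigned decimal number
    for byte in $(od -An -v -tu1); do
        hex_digit $(( byte / 16 ))   # upper 4 bits
        hex_digit $(( byte % 16 ))   # lower 4 bits
    done
    printf '\n'
}

printf 'Hello' | hex_encode   # prints 48656C6C6F
```

The sketch emits uppercase letters as in Table 5 of the RFC. Since the encoding is case-insensitive, lowercase output is equally valid.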

Script

The script base16 can be found on GitHub: https://github.com/emanuelduss/Scripts/blob/master/base16.

On Arch Linux, you can install base16 from the Arch User Repository (AUR):

$ packer --noedit --noconfirm -S base16
[...]

Usage

The script base16 offers the same options and functionality as the base64 tool. The usage can be displayed with the -h option:

$ base16 -h
Usage: base16 [OPTION]... [FILE]
Base16 (Hex) encode or decode FILE, or standard input, to standard output.
With no FILE, or when FILE is -, read standard input.
Options:
-d, decode data
-i, when decoding, ignore non-alphabet characters
-w COLS wrap encoded lines after COLS character (default 76).
Use 0 to disable line wrapping
-h display this help and exit

The data are encoded as described for the base16 alphabet in RFC 4648. When decoding, the input may contain newlines in addition to the bytes of the formal base16 alphabet. Use -i (--ignore-garbage in base64) to attempt to recover from any other non-alphabet bytes in the encoded stream.
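To illustrate what ignoring non-alphabet bytes means, here is a rough decoding sketch in POSIX shell. It is not the code of the base16 script, and hex_decode is a name made up for this example. It assumes that all characters outside 0-9, A-F and a-f are simply dropped before the remaining digits are decoded in pairs.

```shell
#!/bin/sh
# Sketch of a garbage-tolerant Base 16 decoder; hex_decode is a
# hypothetical helper name, not taken from the base16 script.

hex_decode() {
    # Drop everything that is not part of the Base 16 alphabet
    # (newlines, spaces, separators, ...).
    clean=$(tr -cd '0-9A-Fa-f')

    # Every byte needs exactly two hex digits.
    if [ $(( ${#clean} % 2 )) -ne 0 ]; then
        echo "hex_decode: odd number of hex digits" >&2
        return 1
    fi

    while [ -n "$clean" ]; do
        rest=${clean#??}          # input without the first two digits
        pair=${clean%"$rest"}     # the first two digits
        # Convert the pair to octal and let printf emit that byte.
        printf '%b' "\\0$(printf '%03o' "0x$pair")"
        clean=$rest
    done
}

printf '48 65:6c\n6C 6f' | hex_decode   # prints Hello
```

Note that this approach cannot tell garbage from data when the garbage itself consists of hex digits, so it is a best-effort recovery only.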

Examples

For example, you can encode some data. When no file is provided, the data is read from standard input:

$ echo -en "Hello World\nThis is a test with a newline" | base16 | tee /tmp/test.out
48656c6c6f20576f726c640a546869732069732061207465737420776974682061206e65776c
696e65
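The line break comes from the default wrap width of 76 characters. As a small sketch, assuming wrapping does nothing more than split the hex stream every 76 characters, the same layout can be reproduced with the standard fold tool. The hex string is the 82-digit encoding of the example text.

```shell
#!/bin/sh
# The hex string is the encoding of the example text used in this post.
hex=48656c6c6f20576f726c640a546869732069732061207465737420776974682061206e65776c696e65

# Wrap after 76 characters, the default column width.
printf '%s\n' "$hex" | fold -w 76
```

This prints one line of 76 hex digits followed by a second line containing the remaining 6 digits.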

If you don’t like the line breaks, use the -w 0 option to disable them:

$ echo -en "Hello World\nThis is a test with a newline" | base16 -w 0
48656c6c6f20576f726c640a546869732069732061207465737420776974682061206e65776c696e65

This can then be decoded again using base16:

$ base16 -d /tmp/test.out
Hello World
This is a test with a newline

Update 2020

I learned that the tool basenc from the GNU coreutils is able to convert hex/base16:

$ echo -en "Hello World\nThis is a test with a newline" | basenc --base16 -w 0
48656C6C6F20576F726C640A546869732069732061207465737420776974682061206E65776C696E65

$ basenc --decode --base16 --ignore-garbage <<< 48656C6C6F20576F726C640A546869732069732061207465737420776974682061206E65776C696E65
Hello World
This is a test with a newline

This is much easier, because it does not require any additional scripts. The base16 script is therefore not necessary anymore ;-).

If you use it a lot, you can add the following alias to your .bashrc to save some keystrokes:

alias base16="basenc --base16"

References