Text Encoding Converter

Convert text between Hex, Binary, Unicode and more encoding formats

Encoding Converter Documentation

What is Character Encoding?

Character encoding is a system that maps characters to numbers that computers can process. Different encoding schemes are used for different purposes such as storing, transmitting, or displaying text data. Common encodings include ASCII, UTF-8, UTF-16, etc.

Supported Formats

Hexadecimal (Hex)

Hexadecimal representation using digits 0-9 and letters A-F. Each byte is represented by two hex characters. Widely used in programming and debugging.
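As a rough illustration of the byte-to-hex mapping described above, here is a minimal Python sketch. It assumes the text is first encoded as UTF-8; the function names and the `delimiter`, `prefix`, and `uppercase` parameters are hypothetical and are not taken from the tool's own code.

```python
def to_hex(text: str, delimiter: str = " ", prefix: str = "", uppercase: bool = False) -> str:
    """Encode text as UTF-8 and render each byte as two hex digits."""
    pairs = [f"{byte:02x}" for byte in text.encode("utf-8")]
    if uppercase:
        pairs = [p.upper() for p in pairs]
    return delimiter.join(prefix + p for p in pairs)


def from_hex(hex_text: str) -> str:
    """Decode plain hex digits (whitespace allowed, no prefixes) back to text."""
    return bytes.fromhex(hex_text).decode("utf-8")


print(to_hex("Hi"))                              # 48 69
print(to_hex("é", prefix="0x", uppercase=True))  # 0xC3 0xA9
print(from_hex("48 69"))                         # Hi
```

Note that a non-ASCII character such as é produces more than one byte under UTF-8, so one character does not always mean one hex pair.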

Binary

Binary representation using only 0 and 1. Each byte is represented by 8 bits. This is the fundamental data representation used by computers.
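The same idea can be sketched for binary output. This example assumes UTF-8 as the underlying byte encoding and uses hypothetical helper names; it is an illustration, not the tool's implementation.

```python
def to_binary(text: str, delimiter: str = " ") -> str:
    """Encode text as UTF-8 and render each byte as 8 bits."""
    return delimiter.join(f"{byte:08b}" for byte in text.encode("utf-8"))


def from_binary(bits: str) -> str:
    """Decode whitespace-separated 8-bit groups back to text."""
    return bytes(int(group, 2) for group in bits.split()).decode("utf-8")


print(to_binary("Hi"))                    # 01001000 01101001
print(from_binary("01001000 01101001"))   # Hi
```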

Unicode Escape

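Unicode escape sequences in \uXXXX format, where XXXX is four hexadecimal digits. They are commonly used in JavaScript source code and JSON strings to represent Unicode characters in ASCII-only text. In JSON, a character above U+FFFF is written as a surrogate pair of two \uXXXX escapes.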
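One way to produce and read JSON-style escapes is to lean on Python's standard `json` module, as in the sketch below. The helper names are hypothetical, and the approach has a side effect worth knowing: quotes, backslashes, and control characters in the input are escaped too, because JSON requires it.

```python
import json


def to_unicode_escape(text: str) -> str:
    # json.dumps escapes every non-ASCII character by default
    # (ensure_ascii=True). Slice off the surrounding double quotes.
    return json.dumps(text)[1:-1]


def from_unicode_escape(escaped: str) -> str:
    # Wrap the escaped text in quotes and let the JSON parser decode it.
    # Assumes the input contains no unescaped double quotes.
    return json.loads('"' + escaped + '"')


print(to_unicode_escape("中"))                  # \u4e2d
print(to_unicode_escape("😀"))                 # \ud83d\ude00
print(from_unicode_escape(r"\u4e2d\u6587"))    # 中文
```

The emoji shows the surrogate-pair behaviour: U+1F600 does not fit in four hex digits, so it is emitted as two escapes.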

HTML Entity

HTML entity encoding, including named entities (like &amp;amp;) and numeric entities (like &amp;#38; or &amp;#x26;). Used to safely display special characters in HTML.
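The three entity styles can be illustrated with Python's standard `html` module plus a small hand-written helper. `to_numeric_entities` is a hypothetical function written for this sketch, and its choice of which characters to escape (the HTML special characters and anything non-ASCII) is an assumption.

```python
import html


def to_numeric_entities(text: str, hexadecimal: bool = False) -> str:
    """Replace HTML special characters and non-ASCII characters with numeric entities."""
    out = []
    for ch in text:
        if ch in "&<>\"'" or ord(ch) > 127:
            out.append(f"&#x{ord(ch):X};" if hexadecimal else f"&#{ord(ch)};")
        else:
            out.append(ch)
    return "".join(out)


print(html.escape("Tom & Jerry <3"))               # Tom &amp; Jerry &lt;3
print(to_numeric_entities("&"))                    # &#38;
print(to_numeric_entities("&", hexadecimal=True))  # &#x26;
print(html.unescape("&amp; &#38; &#x26;"))         # & & &
```

All three entity forms decode to the same ampersand, which is why a decoder can accept them interchangeably.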

Punycode

Encoding scheme for Internationalized Domain Names (IDN). Converts Unicode characters to ASCII-compatible encoding, prefixed with xn--.
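Python ships `punycode` and `idna` codecs that can illustrate the conversion. This is a sketch with hypothetical helper names. The built-in `idna` codec follows the older IDNA 2003 rules, so results for some characters may differ from what current browsers and registries produce.

```python
def to_ascii_domain(domain: str) -> str:
    """Convert each non-ASCII label to Punycode and add the xn-- prefix."""
    return domain.encode("idna").decode("ascii")


def to_unicode_domain(domain: str) -> str:
    """Convert xn-- labels back to their Unicode form."""
    return domain.encode("ascii").decode("idna")


print(to_ascii_domain("münchen.de"))            # xn--mnchen-3ya.de
print(to_unicode_domain("xn--mnchen-3ya.de"))   # münchen.de

# The raw Punycode codec works on a single label and adds no prefix.
print("münchen".encode("punycode").decode("ascii"))  # mnchen-3ya
```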

Common Use Cases

View hexadecimal or binary representation of characters during debugging
Handle data encoding in network protocols
Analyze and fix encoding issues (mojibake)
Use Unicode escape sequences in code
Handle Internationalized Domain Names (IDN)
Character escaping in HTML/XML
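The mojibake case in the list above can be reproduced and reversed in a few lines. This sketch assumes the most common scenario, UTF-8 bytes that were mistakenly decoded as Latin-1; real-world damage often involves other code pages, and the helper names are hypothetical.

```python
def make_mojibake(text: str) -> str:
    """Simulate the bug: UTF-8 bytes read back as if they were Latin-1."""
    return text.encode("utf-8").decode("latin-1")


def repair_mojibake(garbled: str) -> str:
    """Undo the bug by reversing the wrong decode, then decoding correctly."""
    return garbled.encode("latin-1").decode("utf-8")


garbled = make_mojibake("café")
print(garbled)                    # cafÃ©
print(repair_mojibake(garbled))   # café
```

Looking at the hex form of the garbled text (c3 a9 for é) is usually the quickest way to tell which wrong decoding was applied.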

Character Set vs Encoding Format

Character set and encoding format are two different concepts. A character set defines which characters are available and assigns each one a number (such as ASCII, GB2312, GBK, or Unicode), while an encoding format defines how those numbers are stored as bytes (such as UTF-8 or UTF-16). For some standards the two go together: the name GB2312 is commonly used for both the character set and its usual byte encoding. Unicode separates them, so the same Unicode text can be stored as UTF-8, UTF-16, or UTF-32.
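To make the distinction concrete, the sketch below encodes one character under several schemes and prints the resulting bytes. The helper name is hypothetical; the codec names are those of Python's standard library, and `bytes.hex` with a separator needs Python 3.8 or later.

```python
def bytes_under(text: str, codecs=("utf-8", "utf-16-be", "gbk")) -> dict:
    """Return the hex bytes of the same text under several encodings."""
    return {codec: text.encode(codec).hex(" ") for codec in codecs}


for codec, hex_bytes in bytes_under("中").items():
    print(f"{codec:10} {hex_bytes}")
# utf-8      e4 b8 ad
# utf-16-be  4e 2d
# gbk        d6 d0
```

The UTF-16 bytes match the code point U+4E2D directly, UTF-8 spreads the same number over three bytes, and GBK uses an unrelated two-byte value from its own table.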

If you need to convert text between different charsets (such as GBK, UTF-8, or ISO-8859-1) rather than view the same text in hex, binary, or escaped form, please use the Character Set Converter tool.

Frequently Asked Questions

What is the difference between encoding and encryption?

Encoding transforms data into another representation using a publicly known scheme — there is no secret key involved, and the process is fully reversible by anyone. Encryption scrambles data using a secret key, so only someone with the key can reverse it. Base64 and hex are encodings; AES and RSA are encryption algorithms.

Why does Base64-encoded text end with = or ==?

Base64 encodes every 3 input bytes into 4 output characters. When the input length is not divisible by 3, one or two = characters are added as padding to make the output a multiple of 4 characters. One = means the final group held only 2 input bytes; two (==) means it held only 1. Some implementations omit padding, which works as long as the decoder accepts unpadded input.
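The padding rule is easy to check with Python's standard `base64` module. The sketch below also shows one possible way to restore stripped padding before decoding; `restore_padding` is a hypothetical helper, not part of the library.

```python
import base64


def b64(data: bytes) -> str:
    return base64.b64encode(data).decode("ascii")


def restore_padding(encoded: str) -> str:
    """Append the = characters needed to reach a multiple of 4."""
    return encoded + "=" * (-len(encoded) % 4)


print(b64(b"a"))     # YQ==  (1 byte in the final group)
print(b64(b"ab"))    # YWI=  (2 bytes in the final group)
print(b64(b"abc"))   # YWJj  (full group, no padding)

print(base64.b64decode(restore_padding("YQ")))   # b'a'
```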

What is the difference between ASCII and Unicode?

ASCII is a 7-bit encoding that covers 128 characters: the letters A-Z and a-z, the digits 0-9, common punctuation, and control characters. Unicode is a character repertoire standard covering over 140,000 characters across the world's writing systems. UTF-8, UTF-16, and UTF-32 are different ways to encode Unicode code points as bytes. UTF-8 is backward-compatible with ASCII: the first 128 code points are encoded as the same single bytes.
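A short sketch can show both points: ASCII text is byte-for-byte identical in UTF-8, and the three UTF encodings spend different numbers of bytes on the same code point. The helper name is hypothetical, and the little-endian codec variants are used so that no byte order mark is added to the counts.

```python
def byte_lengths(ch: str) -> dict:
    """Number of bytes one character needs in each Unicode encoding."""
    return {codec: len(ch.encode(codec)) for codec in ("utf-8", "utf-16-le", "utf-32-le")}


print("A".encode("ascii") == "A".encode("utf-8"))  # True

print(byte_lengths("A"))    # {'utf-8': 1, 'utf-16-le': 2, 'utf-32-le': 4}
print(byte_lengths("中"))   # {'utf-8': 3, 'utf-16-le': 2, 'utf-32-le': 4}
print(byte_lengths("😀"))  # {'utf-8': 4, 'utf-16-le': 4, 'utf-32-le': 4}
```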

When should I use hex encoding instead of Base64?

Hex (Base16) represents each byte as two hexadecimal characters, which doubles the size but keeps every byte readable at a glance. That makes it handy for debugging byte streams, cryptographic keys, and binary protocol values. Base64 uses 4 characters for every 3 bytes, so its output is about one third shorter than hex, and it is preferred when transmitting binary data in JSON or email. For URLs, use the URL-safe Base64 alphabet, which replaces + and / with - and _.
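The size difference can be measured directly. The following sketch compares the two encodings on a 30-byte sample and shows the URL-safe Base64 alphabet from Python's standard library; `encoded_sizes` is a hypothetical helper name.

```python
import base64


def encoded_sizes(data: bytes) -> dict:
    """Character counts of the same bytes in hex and in Base64."""
    return {
        "raw": len(data),
        "hex": len(data.hex()),
        "base64": len(base64.b64encode(data)),
    }


print(encoded_sizes(bytes(range(30))))  # {'raw': 30, 'hex': 60, 'base64': 40}

# Standard Base64 may contain + and /, which need escaping in URLs.
print(base64.b64encode(b"\xfb\xff").decode("ascii"))          # +/8=
print(base64.urlsafe_b64encode(b"\xfb\xff").decode("ascii"))  # -_8=
```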

What does the Unicode code point U+XXXX notation mean?

U+XXXX is the standard notation for a Unicode code point, where XXXX is a hexadecimal number. For example, U+0041 is the Latin capital letter A, and U+4E2D is the Chinese character 中. Code points range from U+0000 to U+10FFFF. The U+ prefix was introduced by the Unicode Consortium to distinguish code points from byte values.
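Converting between a character and its U+XXXX label takes one line each way in Python. The function names below are hypothetical; the four-digit minimum width follows the usual convention of padding short code points with leading zeros.

```python
def code_point_label(ch: str) -> str:
    """Format a single character as U+XXXX (at least four hex digits)."""
    return f"U+{ord(ch):04X}"


def from_label(label: str) -> str:
    """Turn a label such as U+4E2D back into the character."""
    if not label.upper().startswith("U+"):
        raise ValueError("expected a label starting with U+")
    return chr(int(label[2:], 16))


print(code_point_label("A"))    # U+0041
print(code_point_label("中"))   # U+4E2D
print(code_point_label("😀"))  # U+1F600
print(from_label("U+4E2D"))     # 中
```

`chr` accepts values up to 0x10FFFF and raises `ValueError` beyond that, which matches the upper limit of the code point range.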

Related Tools

Charset Converter

Convert text encoding between UTF-8, GBK, Big5, Shift_JIS, ISO-8859, Windows codepages with auto-detection

Base Converter

Convert between binary, octal, decimal, and hexadecimal number systems with custom base support (2-36)

URL Encoder/Decoder

Encode and decode URLs to ensure compliance and usability

HTML Encoder/Decoder

Convert special characters to HTML entities with named, decimal, and hexadecimal formats to prevent XSS attacks

Base64 Encoder/Decoder

Quickly encode and decode Base64 strings, supporting both text and file conversion

Escape/Unescape Tool

Escape and unescape strings between multiple formats including JavaScript, JSON, HTML, XML, CSV, SQL and more
