{
	"id": "ffd85561-d4a2-48fd-a449-3eea69bec222",
	"created_at": "2026-04-06T00:14:42.635385Z",
	"updated_at": "2026-04-10T03:24:29.433522Z",
	"deleted_at": null,
	"sha1_hash": "0cdc744d04a327710ffc992d2f9079678c869a1f",
	"title": "Binary-to-text encoding",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 228328,
	"plain_text": "Binary-to-text encoding\r\nBy Contributors to Wikimedia projects\r\nPublished: 2005-11-30 · Archived: 2026-04-05 13:00:06 UTC\r\nFrom Wikipedia, the free encyclopedia\r\nA binary-to-text encoding is a data encoding scheme that represents binary data as plain text. Generally, the\r\nbinary data consists of a sequence of arbitrary 8-bit byte (a.k.a. octet) values and the text is restricted to the\r\nprintable character codes of commonly-used character encodings such as ASCII. In general, arbitrary binary data\r\ncontains values that are not printable character codes, so software designed to only handle text fails to process\r\nsuch data. Encoding binary data as text allows information that is not inherently stored as text to be processed by\r\nsoftware that otherwise cannot process arbitrary binary data. The software cannot interpret the information, but it\r\ncan perform useful operations on the data, such as transmitting and storing it.\r\nPGP documentation (RFC 9580) uses the term \"ASCII armor\" for binary-to-text encoding when referring to\r\nBase64.\r\nConceptually, binary-to-text encoding differs from numeric representation for a numeric base (radix). For\r\nexample, decimal is a scheme for representing a value in base 10, but it is not a binary-to-text encoding. A binary-to-text encoding could be devised that uses decimal representation for encoded data, but such a system would use\r\nonly 10 values of a 4-bit encoded sequence, leaving 6 values unused. A more efficient encoding would use all 16\r\nvalues. This is Base16, which uses hexadecimal for encoding each 4-bit sequence. Notably, because 16 is a power\r\nof two, Base16 and hexadecimal are indistinguishable in practice even though they differ conceptually.\r\nEscape encodings such as percent-encoding and quoted-printable also allow for representing arbitrary binary data\r\nas text, but in a significantly different way. 
A binary-to-text encoding involves encoding an entire input sequence\r\nwhereas an escape encoding allows for embedding binary data in data that is already and inherently text.\r\nTransmitting binary data as text\r\nA binary-to-text encoding enables transmitting data on a communication channel that does not allow arbitrary\r\nbinary data (such as email or NNTP) or is not 8-bit clean. The encoding enables transmitting binary data over a\r\ncommunications protocol that is designed to carry human-readable (i.e. English language) text. Such a\r\nprotocol often supports only 7-bit character values (and within that avoids certain control codes), may require line\r\nbreaks at certain maximum intervals, and may not preserve whitespace. Thus, only the 94 printable ASCII\r\ncharacters are safe to use to convey data.\r\nThe ASCII text-encoding standard uses 7 bits to encode characters. With this it is possible to encode 128 (i.e. 2⁷)\r\nunique values (0–127) to represent the alphabetic, numeric, and punctuation characters commonly used in English,\r\nplus a selection of non-printable control characters. For example, the capital letter A is represented as 65 (41₁₆,\r\n100 0001₂), the numeral 2 is 50 (32₁₆, 011 0010₂), the right curly brace } is 125 (7D₁₆, 111 1101₂), and the\r\ncarriage return control character CR is 13 (0D₁₆, 000 1101₂).\r\nIn contrast, most computers store data in memory organized in eight-bit bytes (a.k.a. octets). Files that contain\r\nmachine-executable code and non-textual data typically contain all 256 possible eight-bit byte values. Many\r\ncomputer programs came to rely on this distinction between seven-bit text and eight-bit binary data, and would\r\nnot function properly if non-ASCII characters appeared in data that was expected to include only ASCII text. 
For\r\nexample, if the value of the eighth bit is not preserved, the program might interpret a byte value above 127 as a\r\nflag telling it to perform some function.\r\nIt is often desired to send non-textual data through a text-based system, such as attaching an image to an e-mail\r\nmessage. To accomplish this, the data is encoded in some way, such that 8-bit data is encoded as 7-bit ASCII\r\ncharacters (generally using only alphanumeric and punctuation characters, i.e. the ASCII printable characters). Upon\r\narrival at its destination, it is then decoded back to its 8-bit form. This process is referred to as binary-to-text\r\nencoding. Many programs, such as PGP and GNU Privacy Guard, perform this conversion to allow for data\r\ntransport.\r\nEncoding plain text\r\nBinary-to-text encoding methods are also used as a mechanism for encoding plain text. Some systems have a more\r\nlimited character set they can handle; not only are they not 8-bit clean, some cannot even handle every printable\r\nASCII character. Other systems have limits on the number of characters that may appear between line breaks, such\r\nas the \"1000 characters per line\" limit of some Simple Mail Transfer Protocol software, as allowed by RFC 2821.\r\nStill others add headers or trailers to the text. A few poorly-regarded but still-used protocols use in-band signaling,\r\ncausing confusion if specific patterns appear in the message. The best-known is the string \"From \" (including\r\ntrailing space) at the beginning of a line, used to separate mail messages in the mbox file format.\r\nBy using a binary-to-text encoding on messages that are already plain text, then decoding on the other end, one\r\ncan make such systems appear to be completely transparent. 
This is sometimes referred to as 'ASCII armoring'.\r\nFor example, the ViewState component of ASP.NET uses base64 encoding to safely transmit text via HTTP POST,\r\nin order to avoid delimiter collision.\r\nThe table below describes notable binary-to-text encodings. The efficiency listed is the ratio between the number\r\nof bits in the input and the number of bits in the encoded output. For any encoding that maps n input possibilities\r\ninto one 8-bit character, the efficiency is log₂(n)/8.\r\nEncoding | Efficiency | Programming language implementations | Comments\r\nAscii85 | 80% | awk (archived 2014-12-29 at the Wayback Machine), C, C (2), C#, F#, Go, Java, Perl, Python, Python (2) | There exist several variants of this encoding, Base85, btoa, etc.\r\nBase16 | 50% | Most languages | As it's based on hexadecimal, there are variants for upper, lower or either case\r\nBase26 | 58.3% | | Used to convert SHA-256 hash into all-uppercase strings in InChIKey (a standard indexing system of chemical structures)[1] and SID (sequence identification, an indexing system of PCR amplicons in forensics).[2] InChIKey specifically uses two kinds of mappings: 14b:3ch, 9b:2ch.\r\nBase32 | 62.5% | ANSI C, Delphi, Go, Java, C#, F#, Python | \r\nBase36 | 64.6% | bash, C, C++, C#, Java, Perl, PHP, Python, Visual Basic, Swift, many others | Uses only numerals (0–9) and lowercase letters (a–z). Commonly used by URL redirection systems like TinyURL or SnipURL/Snipr as compact alphanumeric identifiers.\r\nBase45 | 68.6% (97%[a]) | Go, Python | Defined in IETF Specification RFC 9285 for including binary data compactly in a QR code.[3]\r\nBase56 | 72.6% | PHP, Python, Go | Like Base58 but further excludes characters 1 and lowercase-O ( o ) in order to minimise the risk of fraud and human-error.[4]\r\nBase58 | 73.2% | C, C++, Python, C#, Java, Base58 in the original bitcoin source code | Like Base64 but excludes non-alphanumeric characters ( + and / ) and pairs of characters that often look ambiguous when rendered: zero ( 0 ) and capital-O ( O ), and capital-I ( I ) and lowercase-L ( l ). Base58 is used to represent bitcoin addresses.[citation needed] For SegWit, it was replaced by Bech32.\r\nBase62 | 74.4% | Rust, Python | Like Base64 but contains only alphanumeric characters.\r\nBase64 | 75.0% | awk (archived 2014-12-29 at the Wayback Machine), C, C (2), Delphi, Go, Python, many others | An early and still-popular encoding, first specified as part of RFC 989 in 1987\r\nBase85 | 80.0% | C, Python, Python (2) | Revised version of Ascii85.\r\nBaseXML[5] | 80% ± 6 chars | C, Python, JavaScript | Encoding for stuffing data in XML.\r\nBase91[6] | 81.3% | C#, F# | Constant width variant\r\nbasE91[7] | 81.3% | C, Java, PHP, 8086 Assembly, AWK, C#, F#, Rust | Variable width variant\r\nBase94[8] | 81.9% | Python, C, Rust | \r\nBase122[9] | 87.5% | JavaScript, Python, Java, Base125 Python and Javascript, Go, C | Encodes to UTF-8, hence a different efficiency claim from theoretical.\r\nBech32 | 62.5% - at least 8 chars overhead (label, separator, 6-char ECC) | C, C++, JavaScript, Go, Python, Haskell, Ruby, Rust | Specification.[10] Used in Bitcoin and the Lightning Network.[11] The data portion is encoded like Base32 with the possibility to check and correct up to 6 mistyped characters using the 6-character BCH code at the end, which also checks/corrects the Human Readable Part. The Bech32m variant has a subtle change that makes it more resilient to changes in length.[12]\r\nBinHex | 75% | Perl, C, C (2) | MacOS Classic\r\nIntel HEX | ≲50% | C library, C++ | Typically used to program EPROM, NOR flash memory chips\r\nMIME | See Quoted-printable and Base64 | See Quoted-printable and Base64 | Encoding container for e-mail-like formatting\r\nS-record (Motorola hex) | 49.6% | C library, C++ | Typically used to program EPROM, NOR flash memory chips. 49.6% assumes 255 binary bytes per record.\r\nTektronix hex | ≲50% | | Typically used to program EPROM, NOR flash memory chips.\r\nTxMS | Variable | TypeScript, CLI, Dart | TxMS compresses binary data into a readable text format using Binary-to-Text encoding and allows reversible conversion back to hexadecimal.\r\nuuencoding | ~60% (up to 70%) | Perl, C, Delphi, Java, Python, probably many others | An early encoding developed in 1980 for Unix-to-Unix Copy. Largely replaced by MIME and yEnc\r\nxxencoding | ~75% (similar to Uuencoding) | C, Delphi | Proposed (and occasionally used) as replacement for Uuencoding to avoid character set translation problems between ASCII and the EBCDIC systems that could corrupt Uuencoded data\r\nz85 (ZeroMQ spec:32/Z85) | 80% (similar to Ascii85/Base85) | C (original), C#, Dart, Erlang, Go, Lua, Ruby, Rust and others | Specifies a subset of ASCII similar to Ascii85, omitting a few characters that may cause program bugs ( ` \\ \" ' _ , ; ). The format conforms to ZeroMQ spec:32/Z85.\r\nRFC 1751 (S/KEY[13]) | 33% | C, Python | \"A Convention for Human-readable 128-bit Keys\". A series of small English words is easier for humans to read, remember, and type in than decimal or other binary-to-text encoding systems.[14] Each 64-bit number is mapped to six short words, of one to four characters each, from a public 2048-word dictionary.[13]\r\nSome older and today uncommon formats include BOO (a base64),[15] BTOA (vaguely-defined \"binary to ascii\",\r\nhistorically base85, today in JavaScript base64), and USR encoding.\r\nBase64 (with many variants including uuencoding) maps sequences of 6 bits to printable characters. Since there\r\nare more than 2⁶ = 64 printable characters, this is possible. A given sequence of bytes is translated by viewing it as\r\na stream of bits, breaking this stream into chunks of 6 bits and generating the sequence of corresponding\r\ncharacters. The different encodings differ in the mapping between sequences of bits and characters and in how the\r\nresulting text is formatted.\r\nSome encodings (the original version of BinHex and the recommended encoding for CipherSaber) use four bits\r\ninstead of six, mapping all possible sequences of 4 bits onto the 16 standard hexadecimal digits. 
Using 4 bits per\r\nencoded character leads to a 50% longer output than base64, but simplifies encoding and decoding: expanding\r\neach byte in the source independently to two encoded bytes is simpler than base64's expanding 3 source bytes to 4\r\nencoded bytes.\r\nOut of PETSCII's first 192 codes, 164 have visible representations when quoted: 5 (white), 17–20 and 28–31\r\n(colors and cursor controls), 32–90 (ASCII equivalent), 91–127 (graphics), 129 (orange), 133–140 (function keys),\r\n144–159 (colors and cursor controls), and 160–192 (graphics).[16] This theoretically permits encodings, such as\r\nbase128, between PETSCII-speaking machines.\r\nSee also\r\nAlphanumeric shellcode – Code intended as a payload to exploit a software vulnerability\r\nCharacter encoding – Using numbers to represent text characters\r\nComputer number format – Internal representation of numeric values in a digital computer\r\nGeocode – Code that represents a geographic entity (location or object)\r\nNumeral system – Notation for expressing numbers\r\nPunycode – Encoding for Unicode domain names\r\nNotes\r\na. ^ Encoding for QR code generation automatically selects the encoding to match the input character set,\r\nencoding 2 alphanumeric characters in 11 bits, and Base45 encodes 16 bits into 3 such characters. The\r\nefficiency is thus 32 bits of binary data encoded in 33 bits: 97%.\r\nReferences\r\n1. ^ \"Technical FAQ - InChI Trust\". inchi-trust.org. Retrieved 2021-01-08.\r\n2. ^ Young, Brian; Faris, Tom; Armogida, Luigi (2019). \"A nomenclature for sequence-based forensic DNA\r\nanalysis\". Forensic Science International: Genetics. 42: 14–20. doi:10.1016/j.fsigen.2019.06.001.\r\nPMID 31207427. “[...] 2) the hexadecimal output of the hash function is converted to hexavigesimal (base-26)”\r\n3. ^ Fältström, Patrik; Ljunggren, Fredrik; Gulik, Dirk-Willem van (2022-08-11). \"The Base45 Data\r\nEncoding\". 
“Even in Byte mode, a typical QR code reader tries to interpret a byte sequence as text\r\nencoded in UTF-8 or ISO/IEC 8859-1. ... Such data has to be converted into an appropriate text before\r\nthat text could be encoded as a QR code. ... Base45 ... offers a more compact QR code encoding.”\r\n4. ^ Duggan, Ross (August 18, 2009). \"Base-56 Integer Encoding in PHP\".\r\n5. ^ \"BaseXML - for XML1.0+\". GitHub. 16 March 2019.\r\n6. ^ Dake He; Yu Sun; Zhen Jia; Xiuying Yu; Wei Guo; Wei He; Chao Qi; Xianhui Lu. \"A Proposal of\r\nSubstitute for Base85/64 – Base91\" (PDF). International Institute of Informatics and Systemics.\r\n7. ^ \"binary to ASCII text encoding\". basE91. SourceForge. Retrieved 2023-03-20.\r\n8. ^ \"Convert binary data to a text with the lowest overhead\". Vorakl's notes. April 18, 2020.\r\n9. ^ Albertson, Kevin (Nov 26, 2016). \"Base-122 Encoding\".\r\n10. ^ \"bitcoin/bips\". GitHub. 8 December 2021.\r\n11. ^ Rusty Russell; et al. (2020-10-15). \"Payment encoding in the Lightning RFC repo\". GitHub.\r\n12. ^ \"Bech32m format for v1+ witness addresses\". GitHub. 5 December 2021.\r\n13. ^ RFC 1760 \"The S/KEY One-Time Password System\".\r\n14. ^ RFC 1751 \"A Convention for Human-Readable 128-bit Keys\"\r\n15. ^ \"Boo File Format, Encoding and Decoding\". Columbia University.\r\n16. ^ \"Commodore 64 PETSCII codes\". sta.c64.org.\r\nSource: https://en.wikipedia.org/wiki/Binary-to-text_encoding",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"MITRE"
	],
	"references": [
		"https://en.wikipedia.org/wiki/Binary-to-text_encoding"
	],
	"report_names": [
		"Binary-to-text_encoding"
	],
	"threat_actors": [
		{
			"id": "aa73cd6a-868c-4ae4-a5b2-7cb2c5ad1e9d",
			"created_at": "2022-10-25T16:07:24.139848Z",
			"updated_at": "2026-04-10T02:00:04.878798Z",
			"deleted_at": null,
			"main_name": "Safe",
			"aliases": [],
			"source_name": "ETDA:Safe",
			"tools": [
				"DebugView",
				"LZ77",
				"OpenDoc",
				"SafeDisk",
				"TypeConfig",
				"UPXShell",
				"UsbDoc",
				"UsbExe"
			],
			"source_id": "ETDA",
			"reports": null
		}
	],
	"ts_created_at": 1775434482,
	"ts_updated_at": 1775791469,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/0cdc744d04a327710ffc992d2f9079678c869a1f.pdf",
		"text": "https://archive.orkl.eu/0cdc744d04a327710ffc992d2f9079678c869a1f.txt",
		"img": "https://archive.orkl.eu/0cdc744d04a327710ffc992d2f9079678c869a1f.jpg"
	}
}