
Exonum Serialization Format

Exonum uses a binary serialization format for communication among full nodes, cryptographic operations on light clients and storage of data. The format is designed to provide several important properties, including resilience to maliciously crafted messages, zero-copy deserialization and canonicity.

Serialization in Exonum differs from serialization in the usual sense, since there is no separate step that transforms a structure into binary data. Data is created already "serialized", and Exonum works directly with the serialized data, "deserializing" only the fields it accesses, when necessary.


Communication Among Full Nodes

Full nodes can both serialize messages for sending and deserialize messages when they are received. All information passed between nodes over the network takes the form of messages (see the transactions! macro). Data received as a message is validated against serialization rules.

Communication with Light Clients

Light clients can only serialize messages due to the complexity of the checks necessary for the deserialization process. Transactions are created on the client side and sent to the validators in order to be committed into the blockchain. The client sends the transaction in the JSON format along with a signature over the binary serialization of the transaction. Similarly, when the client receives data from a full node, the client serializes the data received in the JSON format and verifies the signature against the binary serialization of the data.

Storage of Data

The storage holds blocks, configurations and service-specific data. Serialization of stored data is implemented with the encoding_struct! macro. Data obtained from the storage is not validated, since it is assumed to have been validated earlier.


Data Representation Requirements in Exonum

  • Unambiguity (one can uniquely restore the original data based on serialized data)
    It is required that the verification of the transaction structure on all nodes is identical, hence the format should be binary.

  • Canonicity
    The data set must be presented in only one way. Required for uniqueness of the hash of data.

  • Schema-based verification
    It should be possible to define a data schema and check a message for compliance with it (this allows checking the received message before reading its content). The schema should not allow optional fields. In the Exonum serialization format the schema is stored separately from the serializable data.

  • All-or-nothing approach to correctness
    Fields are not read until validation is complete. Validation on message reading cannot be lazy: the entire message is checked to the end first, and only then read without further checks.

  • Tolerance to malicious messages
    The node must not fail on receiving a message violating the serialization rules.

  • Single format for storage and wire transfer
    Nodes forward almost all received messages unchanged. Storing data in the same form means a node does not waste time re-creating the message. In this case, deserialization from the wire format after receiving a message consists of verifying the correctness of all fields. This requirement makes zero-copy deserialization possible.

  • Unambiguity of conversion to JSON and back
    The binary format and JSON (used to communicate with light clients) must have the same data schema. This requirement provides the ability of light clients to verify cryptographically signed messages.

  • Balance between access speed and data compactness
    The Exonum serialization format trades some compactness for speed: segment pointers are not strictly necessary, but they are used to provide quick access to fields.

  • Identity of serialization on all architectures / platforms
    Exonum always uses little-endian byte order, so that reading and writing on modern platforms are direct.

Alternative Serialization Formats

Existing serialization formats do not satisfy all of the requirements above, which is why Exonum uses a custom format.

Serialization Principles

Serialization in Exonum is based on the datatype specifications. Each serializable datatype has its (de)serialization rules, which govern how the instances of this type are (de)serialized from/to a binary buffer. In most cases, these rules are implicitly inferred from the datatype declaration (e.g., via the aforementioned encoding_struct! macro).

The serialization format uses segments and segment pointers to serialize data whose size is unknown at compile time (i.e., doesn't follow from the type specification):

  • Segment is a continuous subarray of the serialization buffer, which stores serialization of a certain serializable type instance
  • Segment pointer is a pair of two unsigned integers: a 0-based starting position of a segment relative to the beginning of the entire serialization buffer, and the byte size of the segment (or the number of elements in the slice in the case a slice is being serialized within the segment)

The segment pointer mechanism is slightly similar to the concept of heap in memory management. Similarly to dynamically allocated memory, datatype serialization procedures may use segments to allocate space for variable-length data, and point to these segments using segment pointers.

Segment Validation Rules

  • Segments must not overlap
  • There must be no gaps between the segments allocated within the same datatype
  • There must be no space in the serialization buffer that does not correspond to any data of the serialized object. In particular, there must be no gaps before or after segments allocated for a certain datatype
  • Segment pointers must not refer to the memory before themselves (this guarantees the absence of loops)
  • The segment pointers must not point outside the buffer
  • Segments must be placed in a specific order determined by the datatype performing segment allocation
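The rules above can be sketched as a single pass over the segments of one datatype, listed in allocation order. This is a simplified illustration under stated assumptions, not Exonum's actual validation code: the names `Segment`, `SegmentError` and `check_segments` are hypothetical, and the "no pointing backwards" rule is approximated by requiring every segment to start at or after the end of the header that holds the pointers.

```rust
/// A segment described by its start position and byte length (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Segment {
    pub start: u32,
    pub len: u32,
}

/// Reasons for rejecting a buffer (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum SegmentError {
    PointsBackwards,
    GapOrOverlap,
    OutOfBounds,
    TrailingBytes,
}

/// Checks segments given in allocation order.
/// `header_end` is the first byte after the fixed-size header,
/// `buffer_len` is the total length of the serialization buffer.
pub fn check_segments(
    header_end: u32,
    buffer_len: u32,
    segments: &[Segment],
) -> Result<(), SegmentError> {
    let mut expected_start = header_end;
    for seg in segments {
        // A pointer must not refer to memory before itself.
        if seg.start < header_end {
            return Err(SegmentError::PointsBackwards);
        }
        // Segments must follow each other without gaps or overlaps.
        if seg.start != expected_start {
            return Err(SegmentError::GapOrOverlap);
        }
        // A pointer must not point outside the buffer.
        let end = seg
            .start
            .checked_add(seg.len)
            .ok_or(SegmentError::OutOfBounds)?;
        if end > buffer_len {
            return Err(SegmentError::OutOfBounds);
        }
        expected_start = end;
    }
    // No bytes may remain that do not belong to any data.
    if expected_start != buffer_len {
        return Err(SegmentError::TrailingBytes);
    }
    Ok(())
}
```

For instance, a 54-byte buffer with a 48-byte header and one 6-byte segment starting at position 48 passes the check, while the same segment with a size of 4 is rejected because two bytes are left unaccounted for.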

Fixed-length and Var-length Types

The way a particular data type is serialized within a complex type (e.g., a structure) depends on whether the instances of this type may exhibit variable byte length of their serialization. These kinds of types are referred to as fixed-length and var-length, respectively.

  • Integer types and booleans are fixed-length. For example, all u32 instances take 4 bytes to serialize.
  • Strings are var-length. "Hello world!" takes 12 bytes to serialize, and "👍" takes 4 bytes.
  • The rules determining whether an aggregate type is fixed-length are described in the corresponding sections below.
  • A custom type can be fixed-length if its data size is known in advance (can be computed at the compilation stage), or var-length otherwise.
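The byte counts quoted in the list can be checked with the Rust standard library, since `str::len` returns the length of the UTF-8 encoding in bytes. A minimal sketch; the names `utf8_byte_len` and `U32_LEN` are illustrative and not part of Exonum:

```rust
use std::mem::size_of;

/// Byte length of a string's UTF-8 encoding (illustrative helper).
/// The result depends on the content, which is what makes strings var-length.
pub fn utf8_byte_len(s: &str) -> usize {
    s.len()
}

/// Byte length of a `u32`, known at compile time, which makes it fixed-length.
pub const U32_LEN: usize = size_of::<u32>();
```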

Primitive Types

Integer Types

u8, i8, u16, i16, u32, i32, u64, i64
Correspond to the same Rust language primitive types. Their size is the same as for the corresponding Rust types, and they are stored in little endian.
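As an illustration of the little-endian layout, the sketch below writes and reads integers with the standard `to_le_bytes` and `from_le_bytes` methods. The helper names `write_u64_le` and `read_u32_le` are hypothetical and are not Exonum functions.

```rust
/// Appends a `u64` to the buffer in little-endian byte order.
pub fn write_u64_le(buf: &mut Vec<u8>, value: u64) {
    buf.extend_from_slice(&value.to_le_bytes());
}

/// Reads a little-endian `u32` starting at `pos`.
/// Returns `None` if the buffer is too short.
pub fn read_u32_le(buf: &[u8], pos: usize) -> Option<u32> {
    let end = pos.checked_add(4)?;
    let slice = buf.get(pos..end)?;
    let mut bytes = [0u8; 4];
    bytes.copy_from_slice(slice);
    Some(u32::from_le_bytes(bytes))
}
```

With these helpers, the value 1234 (0x04d2) is written as the bytes d2 04 00 00 00 00 00 00: the least significant byte comes first.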


Booleans

bool
0x01 for true, 0x00 for false. A message with any other value stored in place of a bool will not pass validation. Size: 1 byte.
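A strict boolean decoder following this rule might look as follows. It is a sketch only; `read_bool` is a hypothetical name, not an Exonum function.

```rust
/// Decodes a serialized boolean, rejecting every byte other than 0x00 and 0x01.
pub fn read_bool(byte: u8) -> Option<bool> {
    match byte {
        0x00 => Some(false),
        0x01 => Some(true),
        _ => None, // any other value fails validation
    }
}
```

Rejecting all other byte values keeps the representation canonical: each boolean has exactly one valid encoding.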

Floating Point Types

f32, f64
Correspond to floating point types in Rust. The types are stored in little endian per binary32 and binary64 formats of the IEEE 754 standard. Infinities, not-a-number (NaN) values, denormalized values and the negative zero are not supported; they cannot be serialized or deserialized.
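The restriction on special values can be expressed with `f64::classify` from the standard library. The function below is an illustrative sketch of such a check (the name `is_serializable_f64` is an assumption, not Exonum's API); an analogous function can be written for f32.

```rust
use std::num::FpCategory;

/// Returns `true` if the value is neither infinite, NaN, denormalized
/// (subnormal) nor the negative zero.
pub fn is_serializable_f64(value: f64) -> bool {
    match value.classify() {
        FpCategory::Normal => true,
        // Only the positive zero is accepted.
        FpCategory::Zero => value.is_sign_positive(),
        FpCategory::Nan | FpCategory::Infinite | FpCategory::Subnormal => false,
    }
}
```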


Support of serialization of floating point types is hidden behind the float_serialize feature gate and is disabled by default. To enable the feature, specify the exonum dependency in the Cargo.toml file of your project as

# Other dependencies...
exonum = { version = "0.5.0", features = [ "float_serialize" ] }

(The version of the Exonum library may differ.)


Use of floating-point arithmetic may lead to hard-to-trace errors and loss of consensus among the nodes in the blockchain due to non-deterministic character of some floating-point operations. Consider using fixed-point arithmetic whenever possible.

Aggregate Types

Byte Buffers

The data of the following fixed-length types is stored in the same way as defined by the underlying byte buffer, without any modifications.

  • Hash
    SHA-256 hash. Size: 32 bytes.

  • PublicKey
    Ed25519 public key. Size: 32 bytes.

  • Signature
    Ed25519 signature. Size: 64 bytes.


Strings

Strings are stored in UTF-8 encoding, which may represent a single char with 1 to 4 bytes. String is a var-length type.

Segment Pointers

Segment pointers take 8 bytes:

  • 4 bytes for the position of the corresponding segment (counted from the beginning of the entire serialization buffer)
  • 4 bytes for the byte size of the segment (or the number of elements, if the segment stores a slice)

Both values are serialized as little-endian unsigned integers (i.e., in the same way as u32). Hence, a segment pointer can be viewed as a structure with two u32 fields.
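Viewed as a structure with two u32 fields, a segment pointer could be encoded and decoded as in the sketch below. The type `SegmentPointer` and its methods are hypothetical names used for illustration only.

```rust
/// An 8-byte segment pointer: position followed by size, both little-endian `u32`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct SegmentPointer {
    pub position: u32,
    pub size: u32,
}

impl SegmentPointer {
    /// Encodes the pointer into its 8-byte representation.
    pub fn to_bytes(self) -> [u8; 8] {
        let mut out = [0u8; 8];
        out[..4].copy_from_slice(&self.position.to_le_bytes());
        out[4..].copy_from_slice(&self.size.to_le_bytes());
        out
    }

    /// Decodes a pointer from its 8-byte representation.
    pub fn from_bytes(bytes: [u8; 8]) -> Self {
        let mut position = [0u8; 4];
        let mut size = [0u8; 4];
        position.copy_from_slice(&bytes[..4]);
        size.copy_from_slice(&bytes[4..]);
        SegmentPointer {
            position: u32::from_le_bytes(position),
            size: u32::from_le_bytes(size),
        }
    }
}
```

For example, a pointer to a 6-byte segment at position 48 is encoded as 30 00 00 00 06 00 00 00.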


Structures

A structure is the representation of a Rust struct. It is a data structure with a fixed number of possibly heterogeneous, ordered, named fields.

In the binary representation, a structure is split into two main parts (which are adjacent to each other for each serialized structure):

  • Header is a fixed sized part.
  • Body is a dynamic sized part; it can be read only after parsing the header.

Fixed-length fields are stored completely in the header. Var-length fields are allocated as segments in the body, plus take 8 bytes for the serialized segment pointer, as described above.

Thus, a segment pointer in the header (the position of which is known at compile time) points to the segment in the body, which contains the actual serialization of the field. Segments are placed in accordance with the validation rules.
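Since every field contributes a known number of bytes to the header, the header length follows from the field types alone. The sketch below illustrates this computation; `FieldKind`, `SEGMENT_POINTER_LEN` and `header_len` are hypothetical names, not part of Exonum.

```rust
/// How a field is stored in the structure header (illustrative type).
pub enum FieldKind {
    /// Fixed-length field stored in the header in full; holds its byte size.
    Fixed(usize),
    /// Var-length field represented in the header by a segment pointer.
    Var,
}

/// Byte size of a serialized segment pointer.
pub const SEGMENT_POINTER_LEN: usize = 8;

/// Computes the header length of a structure from its field kinds.
pub fn header_len(fields: &[FieldKind]) -> usize {
    fields
        .iter()
        .map(|field| match field {
            FieldKind::Fixed(size) => *size,
            FieldKind::Var => SEGMENT_POINTER_LEN,
        })
        .sum()
}
```

A structure with a 32-byte public key, a string and a u64 thus has a header of 32 + 8 + 8 = 48 bytes, and the string contents start at position 48.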


A field of a structure type is always handled as a var-length field (even if all its fields are fixed-length): the segment pointer in the header of the outer structure points to the segment with the inner structure itself.


Consider a structure containing PublicKey, u64 and bool fields. In the binary format all fields of such a structure are placed in the header, and its body is empty. Hence, such a structure is fixed-length.

Structure composition

The encoding_struct! macro allows composing existing structures:

encoding_struct! {
    struct Pair {
        first: u32,
        second: u32,
    }
}

encoding_struct! {
    struct Pairs {
        inner: Vec<Pair>,
    }
}
Here Pair is used as a base type in the Pairs declaration.

Analogously, structures defined with the encoding_struct! macro can be used as field types within messages:

transactions! {
    Transactions {
        const SERVICE_ID = 777;

        struct MessagePairs {
            inner: Vec<Pair>,
        }
    }
}


Slices

A slice is a data structure consisting of an arbitrary number of elements of the same type. A slice is stored so that the position of each element can be computed from its index. Slice elements are located in memory without gaps, in the order of increasing indices.

Like structures, slices have a header and a body. Each element takes 8 bytes in the header for a corresponding segment pointer. If a slice consists of fixed-length elements (except structures), its body contains the elements themselves. If a slice consists of var-length elements (or fixed-length structures), the body of the slice contains segment pointers to the elements, and the elements themselves are located further in memory as segments, as per the validation rules.

The number of slice elements is specified in the header of the structure containing the slice.

All slices are var-length datatypes.
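For the simplest case, a slice of fixed-length elements that are not structures, the layout can be sketched as follows. The example assumes a hypothetical structure with a single `Vec<u16>` field, so that its header consists of one segment pointer holding the slice position and the element count; the function names are illustrative and the code does not use Exonum itself.

```rust
/// Byte size of one `u16` element.
const ELEMENT_LEN: usize = 2;
/// The header of the hypothetical structure holds a single segment pointer.
const HEADER_LEN: u32 = 8;

/// Serializes a structure whose only field is a slice of `u16` values.
pub fn serialize_u16_slice_struct(items: &[u16]) -> Vec<u8> {
    let mut buf = Vec::with_capacity(HEADER_LEN as usize + ELEMENT_LEN * items.len());
    // Segment pointer: the slice starts right after the header...
    buf.extend_from_slice(&HEADER_LEN.to_le_bytes());
    // ...and, for a slice, the second value is the number of elements.
    buf.extend_from_slice(&(items.len() as u32).to_le_bytes());
    // Body: elements stored without gaps, in order of increasing indices.
    for item in items {
        buf.extend_from_slice(&item.to_le_bytes());
    }
    buf
}

/// Position of the element with the given index, computed from the index alone.
pub fn element_offset(slice_start: usize, index: usize) -> usize {
    slice_start + index * ELEMENT_LEN
}
```

Under these assumptions, the slice [1, 2, 3] is serialized as 08 00 00 00 03 00 00 00 01 00 02 00 03 00, and the element with index 2 is found at position 8 + 2 * 2 = 12.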


In the current implementation, a slice of borrowed-type elements cannot be used for serialization/deserialization because the deserialize implementation for borrowed types is missing. For example, a slice of &str cannot be serialized/deserialized. This is planned to be fixed in the future.

Message Serialization

A message is a digitally signed piece of data transmitted through an Exonum network. There are 2 major kinds of messages:

The message serialization consists of 3 main parts: header (includes protocol_version, service_id, message_id, and payload_length fields), body, and signature.

Fields used in message serialization are listed below.

Field | Binary format | Binary offset | JSON
(padding) | u8 | 0 | -
protocol_version | u8 | 1 | number
service_id | u16 | 4..6 | number
message_id | u16 | 2..4 | number
payload_length | u32 | 6..10 | -
body | &[u8] | 10..-64 | object
signature | Ed25519 signature | -64.. | hex string


For the binary format, the table uses the type notation taken from Rust. Offsets also correspond to the slicing syntax, with the exception that Rust does not support negative offsets, which denote an offset relative to the end of the byte buffer.
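The offsets from the table translate directly into slice indexing. The following sketch reads the header fields from a raw message buffer; `MessageHeader` and `parse_header` are hypothetical names rather than Exonum's API, and the function only checks the minimum length and the padding byte, not the body or the signature.

```rust
/// Length of the message header in bytes (offsets 0..10).
pub const HEADER_LEN: usize = 10;
/// Length of the Ed25519 signature that ends the message.
pub const SIGNATURE_LEN: usize = 64;

/// Header fields of a message (illustrative type).
#[derive(Debug, PartialEq, Eq)]
pub struct MessageHeader {
    pub protocol_version: u8,
    pub message_id: u16,
    pub service_id: u16,
    pub payload_length: u32,
}

/// Reads the header fields at the offsets listed in the table.
/// Returns `None` if the buffer is too short or the padding byte is not zero.
pub fn parse_header(message: &[u8]) -> Option<MessageHeader> {
    if message.len() < HEADER_LEN + SIGNATURE_LEN || message[0] != 0 {
        return None;
    }
    Some(MessageHeader {
        protocol_version: message[1],
        message_id: u16::from_le_bytes([message[2], message[3]]),
        service_id: u16::from_le_bytes([message[4], message[5]]),
        payload_length: u32::from_le_bytes([
            message[6], message[7], message[8], message[9],
        ]),
    })
}
```

Note that message_id precedes service_id in the buffer, even though the table lists the fields in the opposite order.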


Padding

The first byte of the message is used as padding. It must be equal to 0.

Protocol Version

The major version of the Exonum serialization protocol. Currently, 0.

Binary presentation: u8 (unsigned 1-byte integer).
JSON presentation: number.

Service ID

Used when the message is a transaction. Sets the service that a transaction belongs to. The pair (service_id, message_id) is used to look up the implementation of the transaction interface (e.g., verify and execute methods).

Binary presentation: u16 (unsigned 2-byte integer).
JSON presentation: number.

Message ID

message_id defines the type of message within the service.


The sample cryptocurrency service includes 2 main types of transactions: TxCreateWallet for creating a wallet with an initial coin balance, and TxTransfer for coin transfer.

Binary presentation: u16 (unsigned 2-byte integer).
JSON presentation: number.

Payload length

The length of the entire message serialization (including its header and signature).

Binary presentation: u32 (unsigned 4-byte integer).
JSON presentation: (not serialized).


Body

The serialized structure (including its header and body) described in the transactions! macro call.


Signature

Ed25519 digital signature over the binary serialization of the message (excluding the signature bytes, i.e., the last 64 bytes of the serialization).

Binary presentation: Ed25519 signature (64 bytes).
JSON presentation: hex string.
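Before a signature can be checked, the message has to be split into the signed part and the signature itself. The sketch below performs only this split with the standard library; the actual Ed25519 verification is left out, and `split_signature` is a hypothetical helper name.

```rust
/// Length of an Ed25519 signature in bytes.
pub const SIGNATURE_LEN: usize = 64;

/// Splits a serialized message into the signed part and the trailing signature.
/// Returns `None` if the buffer is shorter than a signature.
pub fn split_signature(message: &[u8]) -> Option<(&[u8], &[u8])> {
    if message.len() < SIGNATURE_LEN {
        return None;
    }
    // Everything except the last 64 bytes is covered by the signature.
    Some(message.split_at(message.len() - SIGNATURE_LEN))
}
```

The first returned slice is the data the signature is computed over; the second is the signature to be verified against the sender's public key.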

Example of transactions! Usage

const MY_SERVICE_ID: u16 = 777;

transactions! {
    Transactions {
        const SERVICE_ID = MY_SERVICE_ID;

        struct MessageTwoIntegers {
            first: u64,
            second: u64,
        }
    }
}

Here the message body is serialized as a struct with fields first and second having type u64.

Types to Be Supported in Future

The current version does not support the serialization of the following types, but it is planned to be implemented in future:


Example

Consider a structure with three fields:

  • pub_key: PublicKey

  • owner: String

  • balance: u64

To serialize the structure, one may use the macro like this:

#[macro_use] extern crate exonum;
extern crate hex;
use exonum::crypto::PublicKey;
use exonum::storage::StorageValue;
use hex::FromHex;

encoding_struct! {
    struct Wallet {
        pub_key: &PublicKey,
        owner: &str,
        balance: u64,
    }
}
// The `encoding_struct!` macro defines a constructor (`new`)
// and field access methods (`pub_key`, `owner`, `balance`) automatically.

fn main() {
    // Hex-encoded 32-byte public key, split over two lines.
    let pub_key_str = "99ace6c721db293b0ed5b487e6d6111f\
                       22a8c55d2a1b7606b6fa6e6c29671aa1";
    let pub_key: PublicKey = pub_key_str.parse().unwrap();
    let my_wallet = Wallet::new(&pub_key, "Andrew", 1234);

    // Check structure content
    assert_eq!(*my_wallet.pub_key(), pub_key);
    assert_eq!(my_wallet.owner(), "Andrew");
    assert_eq!(my_wallet.balance(), 1234);

    let expected_buffer_str = pub_key_str.to_owned() + // Public key
                              "30000000" +             // Segment start
                              "06000000" +             // Segment size
                              "d204000000000000" +     // Balance
                              "416e64726577";          // Name
    let expected_buffer = Vec::<u8>::from_hex(&expected_buffer_str).unwrap();
    assert_eq!(my_wallet.into_bytes(), expected_buffer);
}

Serialized representation of my_wallet:

Position | Stored data | Hexadecimal form | Comment
0..32 | | 99 ac e6 c7 21 db 29 3b 0e d5 b4 87 e6 d6 11 1f 22 a8 c5 5d 2a 1b 76 06 b6 fa 6e 6c 29 67 1a a1 | Public key
32..36 | 48 | 30 00 00 00 | A little endian segment pointer that refers to the string position in the serialization buffer
36..40 | 6 | 06 00 00 00 | A little endian segment size
40..48 | 1234 | d2 04 00 00 00 00 00 00 | A number in little endian format
48..54 | Andrew | 41 6e 64 72 65 77 | UTF-8 string converted into a byte array
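The layout in the table can also be reproduced by hand with the standard library alone, which may help to see where each byte comes from. The function below is an illustrative sketch that hard-codes the Wallet layout; `serialize_wallet` is a hypothetical name, and the code does not use or replace the encoding_struct! macro.

```rust
/// Header length: 32-byte public key + 8-byte segment pointer + 8-byte balance.
const WALLET_HEADER_LEN: u32 = 32 + 8 + 8;

/// Builds the serialization of a wallet by hand, following the table layout.
pub fn serialize_wallet(pub_key: &[u8; 32], owner: &str, balance: u64) -> Vec<u8> {
    let mut buf = Vec::with_capacity(WALLET_HEADER_LEN as usize + owner.len());
    // 0..32: the public key, stored as is.
    buf.extend_from_slice(pub_key);
    // 32..36: segment position; the string starts right after the header.
    buf.extend_from_slice(&WALLET_HEADER_LEN.to_le_bytes());
    // 36..40: segment size, i.e. the UTF-8 byte length of the owner name.
    buf.extend_from_slice(&(owner.len() as u32).to_le_bytes());
    // 40..48: the balance as a little-endian u64.
    buf.extend_from_slice(&balance.to_le_bytes());
    // 48..: the body, containing the UTF-8 bytes of the owner name.
    buf.extend_from_slice(owner.as_bytes());
    buf
}
```

Calling the function with the key from the example, the owner "Andrew" and the balance 1234 yields the same 54 bytes as listed in the table.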