Binary Canonical Serialization
Binary Canonical Serialization (BCS) is a binary encoding format for structured data. It was originally designed in Diem, and became the standard serialization format for Move. BCS is simple, efficient, deterministic, and easy to implement in any programming language.
The full format specification is available in the BCS repository.
Format
BCS is a binary format that supports unsigned integers up to 256 bits, options, booleans, unit (empty value), fixed and variable-length sequences, and maps. The format is designed to be deterministic, meaning that the same data will always be serialized to the same bytes.
"BCS is not a self-describing format. As such, in order to deserialize a message, one must know the message type and layout ahead of time" from the README
Integers are stored in little-endian format, and variable-length integers are encoded using a variable-length encoding scheme. Sequences are prefixed with their length as ULEB128, enumerations are stored as the index of the variant followed by the data, and maps are stored as an ordered sequence of key-value pairs.
Structs are treated as a sequence of fields, and the fields are serialized in the order they are defined in the struct. The fields are serialized using the same rules as the top-level data.
Using BCS
The Sui Framework includes the sui::bcs module for encoding and decoding data. Encoding functions are native to the VM, and decoding functions are implemented in Move.
Encoding
To encode data, use the bcs::to_bytes
function, which converts data references into byte vectors. This function supports encoding any types, including structs.
// File: move-stdlib/sources/bcs.move
public native fun to_bytes<T>(t: &T): vector<u8>;
The following example shows how to encode a struct using BCS. The to_bytes
function can take any
value and encode it as a vector of bytes.
use sui::bcs;
// 0x01 - a single byte with value 1 (or 0 for false)
let bool_bytes = bcs::to_bytes(&true);
// 0x2a - just a single byte
let u8_bytes = bcs::to_bytes(&42u8);
// 0x2a00000000000000 - 8 bytes
let u64_bytes = bcs::to_bytes(&42u64);
// address is a fixed sequence of 32 bytes
// 0x0000000000000000000000000000000000000000000000000000000000000002
let addr = bcs::to_bytes(&@sui);
Encoding a Struct
Structs encode similarly to simple types. Here is how to encode a struct using BCS:
let data = CustomData {
num: 42,
string: b"hello, world!".to_string(),
value: true
};
let struct_bytes = bcs::to_bytes(&data);
let mut custom_bytes = vector[];
custom_bytes.append(bcs::to_bytes(&42u8));
custom_bytes.append(bcs::to_bytes(&b"hello, world!".to_string()));
custom_bytes.append(bcs::to_bytes(&true));
// struct is just a sequence of fields, so the bytes should be the same!
assert!(&struct_bytes == &custom_bytes, 0);
Decoding
Because BCS does not self-describe and Move is statically typed, decoding requires prior knowledge of the data type. The sui::bcs
module provides various functions to assist with this process.
Wrapper API
BCS is implemented as a wrapper in Move. The decoder takes the bytes by value, and then allows the
caller to peel off the data by calling different decoding functions, prefixed with peel_*
. The
data is split off the bytes, and the remainder bytes are kept in the wrapper until the
into_remainder_bytes
function is called.
use sui::bcs;
// BCS instance should always be declared as mutable
let mut bcs = bcs::new(x"010000000000000000");
// Same bytes can be read differently, for example: Option<u64>
let value: Option<u64> = bcs.peel_option_u64();
assert!(value.is_some(), 0);
assert!(value.borrow() == &0, 1);
let remainder = bcs.into_remainder_bytes();
assert!(remainder.length() == 0, 2);
There is a common practice to use multiple variables in a single let
statement during decoding. It
makes code a little bit more readable and helps to avoid unnecessary copying of the data.
let mut bcs = bcs::new(x"0101010F0000000000F00000000000");
// mind the order!!!
// handy way to peel multiple values
let (bool_value, u8_value, u64_value) = (
bcs.peel_bool(),
bcs.peel_u8(),
bcs.peel_u64()
);
Decoding Vectors
While most of the primitive types have a dedicated decoding function, vectors need special handling, which depends on the type of the elements. For vectors, first you need to decode the length of the vector, and then decode each element in a loop.
let mut bcs = bcs::new(x"0101010F0000000000F00000000000");
// bcs.peel_vec_length() peels the length of the vector :)
let mut len = bcs.peel_vec_length();
let mut vec = vector[];
// then iterate depending on the data type
while (len > 0) {
vec.push_back(bcs.peel_u64()); // or any other type
len = len - 1;
};
assert!(vec.length() == 1, 0);
For most common scenarios, bcs
module provides a basic set of functions for decoding vectors:
peel_vec_address(): vector<address>
peel_vec_bool(): vector<bool>
peel_vec_u8(): vector<u8>
peel_vec_u64(): vector<u64>
peel_vec_u128(): vector<u128>
peel_vec_vec_u8(): vector<vector<u8>>
- vector of byte vectors
Decoding Option
Option is represented as a vector of either 0 or 1 element. To read an option, you would treat it like a vector and check its length (first byte - either 1 or 0).
let mut bcs = bcs::new(x"00");
let is_some = bcs.peel_bool();
assert!(is_some == false, 0);
let mut bcs = bcs::new(x"0101");
let is_some = bcs.peel_bool();
let value = bcs.peel_u8();
assert!(is_some == true, 1);
assert!(value == 1, 2);
If you need to decode an option of a custom type, use the method in the code snippet above.
The most common scenarios, bcs
module provides a basic set of functions for decoding Option's:
peel_option_address(): Option<address>
peel_option_bool(): Option<bool>
peel_option_u8(): Option<u8>
peel_option_u64(): Option<u64>
peel_option_u128(): Option<u128>
Decoding Structs
Structs are decoded field by field, and there is no standard function to automatically decode bytes into a Move struct, and it would have been a violation of the Move's type system. Instead, you need to decode each field manually.
// some bytes...
let mut bcs = bcs::new(x"0101010F0000000000F00000000000");
let (age, is_active, name) = (
bcs.peel_u8(),
bcs.peel_bool(),
bcs.peel_vec_u8().to_string()
);
let user = User { age, is_active, name };
Summary
Binary Canonical Serialization is an efficient binary format for structured data, ensuring consistent serialization across platforms. The Sui Framework provides comprehensive tools for working with BCS, allowing extensive functionality through built-in functions.