Assembly functions
Assembly functions (or asm functions for short) are module-level functions that allow you to write Tact assembly. Unlike all other functions, their bodies consist only of TVM instructions and some other primitives, and don’t use any Tact statements or expressions.
// all assembly functions must start with the "asm" keyword// ↓ asm fun answer(): Int { 42 INT }// ------// Notice that the body contains only// TVM instructions and some primitives,// like numbers or bitstrings, which serve// as arguments to the instructions
TVM instructions
In Tact, the term TVM instruction refers to a command that is executed by the TVM during its runtime — the compute phase. Where possible, Tact will try to optimize their use for you, but it won’t define new ones or introduce extraneous syntax for their pre-processing. Instead, it is recommended to combine the best of Tact and TVM instructions, as shown in the onchainSha256()
example near the end of this page.
Each TVM instruction, when converted to its binary representation, is an opcode (operation code) to be executed by the TVM, plus some optional arguments written immediately after it. However, when writing instructions in asm
functions, the arguments, if any, are written before the instruction and are separated by spaces. This reverse Polish notation (RPN) syntax is intended to show the stack-based nature of the TVM.
For example, the DROP2
or its alias 2DROP
, which drop (discard) two top values from the stack, have the same opcode prefix — 0x5B
, or 1011011
in binary.
/// Pushes `a` and `b` onto the stack, then immediately drops them from itasm fun discardTwo(a: Int, b: Int) { DROP2 }
The arguments to TVM instructions in Tact are called primitives—they don’t manipulate the stack themselves and aren’t pushed onto it by themselves. Attempts to specify a primitive without the instruction that immediately consumes it will result in compilation errors.
/// COMPILATION ERROR!/// The 43 was meant to be an argument to some subsequent TVM instruction,/// but none was found.asm fun bad(): Int { 43 }
For some instructions, the resulting opcode depends on the specified primitive. For example, PUSHINT
, or its shorter alias INT
, has the same opcode 0x7
if the specified numerical argument is in the inclusive range from to . However, if the number is outside that range, the opcode changes accordingly: 0x80
for arguments in the inclusive range from to , 0x81
for arguments in the inclusive range from to , and so on. For your convenience, all these variations of opcodes are described using the same instruction name; in this case, PUSHINT
.
asm fun push42(): Int { // The following will be converted to 0x80 followed by 0x2A // in its binary representation for execution by the TVM 42 PUSHINT}
Stack calling conventions
The syntax for parameters and returns is the same as for other function kinds, but there is one caveat — argument values are pushed onto the stack before the function body is executed, and the return type is what is captured from the stack afterward.
Parameters
The first parameter is pushed onto the stack first, the second one second, and so on, so that the first parameter is at the bottom of the stack and the last one at the top.
asm extends fun storeCoins(self: Builder, value: Int): Builder { // ↑ ↑ // | Pushed last, sits on top of the stack // Pushed first, sits at the bottom of the stack
// Stores the value of type `Int as coins` into the Builder, // taking the Builder from the bottom of the stack // and Int from the top of the stack, // producing a new Builder back STVARUINT16}
Since the bodies of asm
functions do not contain Tact statements, any direct references to parameters in function bodies will be recognized as TVM instructions, which can easily lead to very obscure error messages.
/// Simply returns the value of `x`asm fun identity(x: Int): Int { }
/// COMPILATION ERROR!/// The `BOC` is not recognized as a parameter,/// but instead is interpreted as a non-existent TVM instructionasm fun bocchiThe(BOC: Cell): Cell { BOC }
The parameters of arbitrary Struct types are distributed over their fields, recursively flattened as the arguments are pushed onto the stack. In particular, the value of the first field of the Struct is pushed first, the second is pushed second, and so on, so that the value of the first field is at the bottom of the stack and the value of the last is at the top. If there are nested structures inside those Structs, they’re flattened in the same manner.
// Struct with two fields of type Intstruct AB { a: Int; b: Int }
// This will produce the sum of two fields in the `AB` structasm fun sum(two: AB): Int { ADD }
// Struct with two nested `AB` structs as its fieldsstruct Nested { ab1: AB; ab2: AB }
// This will multiply the sums of fields of nested `AB` structsasm fun mulOfSums(n: Nested): Int { ADD -ROT ADD MUL }
// Action!fun showcase() { sum(AB{ a: 27, b: 50 }); // 77 // ↑ ↑ // | Pushed last, sits on top of the stack // Pushed first, sits at the bottom of the stack
mulOfSums(Nested{ ab1: AB{ a: 1, b: 2 }, ab2: AB{ a: 3, b: 4 } }); // 21 // ↑ ↑ ↑ ↑ // | | | Pushed last, // | | | sits on top of the stack // | | Pushed second-to-last, // | | sits below the top of the stack // | Pushed second, // | sits just above the bottom of the stack // Pushed first, sits at the bottom of the stack}
Returns
When present, the return type of an assembly function attempts to capture relevant values from the resulting stack after the function execution and possible stack arrangements. When not present, however, the assembly function does not take any values from the stack.
When present, an assembly function’s return type attempts to grab relevant values from the resulting stack after the function execution and any result arrangements. If the return type is not present, however, the assembly function does not take any values from the stack.
// Pushes `x` onto the stack, increments it there,// but does not capture the result, leaving it on the stackasm fun push(x: Int) { INC }
Specifying a primitive type, such as an Int
or a Cell
, will make the assembly function capture the top value from the stack. If the run-time type of the captured value doesn’t match the specified return type, an exception with exit code 7 will be thrown: Type check error
.
// CAUSES RUN-TIME ERROR WHEN CALLED!// Pushes `x` onto the stack, does nothing else with it,// then tries to capture it as a Cell, causing an exit code 7: Type check errorasm fun push(x: Int): Cell { }
Just like in parameters, arbitrary Struct return types are distributed across their fields and recursively flattened in exactly the same order. The only differences are that they now capture values from the stack and do so in a right-to-left fashion — the last field of the Struct grabs the topmost value from the stack, the second-to-last grabs the second from the top, and so on, so that the last field contains the value from the top of the stack and the first field contains the value from the bottom.
// Struct with two fields of type Intstruct MinMax { minVal: Int; maxVal: Int }
// Pushes `a` and `b` onto the stack,// then captures two values back via the `MinMax` Structasm fun minmax(a: Int, b: Int): MinMax { MINMAX }
If the run-time type of some captured value doesn’t match a specified field type of the Struct or the nested Structs, if any, an exception with exit code 7 will be thrown: Type check error
. Moreover, attempts to capture more values than there were on the stack throw an exception with exit code 2: Stack underflow
.
// Struct with way too many fields for the initial stack to handlestruct Handler { f1: Int; f2: Int; f3: Int; f4: Int; f5: Int; f6: Int; f7: Int }
// CAUSES RUN-TIME ERROR WHEN CALLED!// Tries to capture 7 values from the stack and map them onto the fields of `Handler`,// but there just aren't that many values on the initial stack after TVM initialization,// which causes an exit code 2 to be thrown: Stack underflowasm fun overHandler(): Handler { }
As parameters and return values of assembly functions, Structs can only have up to 16 fields. Each of these fields can in turn be declared as another Struct, where each of these nested structures can also only have up to 16 fields. This process can be repeated until there are a total of 256 fields of primitive types due to the assembly function limitations. This restriction also applies to the parameter list of assembly functions — you can only declare up to 16 parameters.
// Seventeen fieldsstruct S17 { f1:Int; f2:Int; f3:Int; f4:Int; f5:Int; f6:Int; f7:Int; f8:Int; f9:Int; f10:Int; f11:Int; f12:Int; f13:Int; f14:Int; f15:Int; f16:Int; f17:Int }
// COMPILATION ERROR!asm fun chuckles(imInDanger: S17) { }
Stack registers
The so-called stack registers are a way of referring to the values at the top of the stack. In total, there are 256 stack registers, i.e., values held on the stack at any given time. You can specify any of them using s0
, s1
, …, s255
, but only if a particular TVM instruction expects it as an argument. Otherwise, their concept is meant for succinct descriptions of the effects of a particular TVM instruction in text or comments to the code, not in the code itself.
Register s0
is the value at the top of the stack, register s1
is the value immediately after it, and so on, until we reach the bottom of the stack, represented by s255
, i.e., the 256th stack register. When a value x
is pushed onto the stack, it becomes the new s0
. At the same time, the old s0
becomes the new s1
, the old s1
becomes the new s2
, and so on.
asm fun takeSecond(a: Int, b: Int): Int { // ↑ ↑ // | Pushed last, sits on top of the stack // Pushed first, sits second from the top of the stack
// Now, let's swap s0 (top of the stack) with s1 (second-to-top)
// Before │ After // ───────┼─────── // s0 = b │ s0 = a // s1 = a │ s1 = b SWAP
// Then, let's drop the value from the top of the stack
// Before │ After // ───────┼─────── // s0 = a │ s0 = b // s1 = b │ s1 is now either some value deeper or just blank DROP
// At the end, we have only one value on the stack, which is b // Thus, it is captured by our return type `Int`}
fun showcase() { takeSecond(5, 10); // 10, i.e., b}
Arrangements
Often, it’s useful to change the order of arguments pushed onto the stack or the order of return values without referring to stack registers in the body. You can do this with asm
arrangements.
Considering arrangements, the evaluation flow of the assembly function can be thought of in these 5 steps:
- The function takes arguments in the order specified by the parameters.
- If an argument arrangement is present, arguments are reordered before being pushed onto the stack.
- The function body, consisting of TVM instructions and primitives, is executed.
- If a result arrangement is present, resulting values are reordered on the stack.
- The resulting values are captured (partially or fully) by the return type of the function.
The argument arrangement has the syntax asm(arg2 arg1)
, where arg1
and arg2
are some arguments of the function arranged in the order we want to push them onto the stack: arg1
will be pushed first and placed at the bottom of the stack, while arg2
will be pushed last and placed at the top of the stack. Arrangements are not limited to two arguments and operate on all parameters of the function. If there are any parameters of arbitrary Struct types, their arrangement is done prior to their flattening.
// Changing the order of arguments to match the STDICT signature:// `c` will be pushed first and placed at the bottom of the stack,// while `self` will be pushed last and placed at the top of the stackasm(c self) extends fun asmStoreDict(self: Builder, c: Cell?): Builder { STDICT }
The return arrangement has the syntax asm(-> 1 0)
, where 1 and 0 represent a left-to-right reordering of stack registers s1
and s0
, respectively. The contents of s1
will be at the top of the stack, followed by the contents of s0
. Arrangements are not limited to two return values and operate on captured values.
If an arbitrary struct is specified as the return type, the arrangement is made concerning its fields, mapping values on the stack to the recursively flattened struct.
// Changing the order of return values of LDVARUINT16 instruction,// since originally it would place the modified Slice on top of the stackasm(-> 1 0) extends fun asmLoadCoins(self: Slice): SliceInt { LDVARUINT16 }// ↑ ↑// | Value of the stack register 0,// | which is the topmost value on the stack// Value of the stack register 1,// which is the second-to-top value on the stack// And the return type `SliceInt`, which is the following Struct:struct SliceInt { s: Slice; val: Int }
Both argument and return arrangement can be combined together and written as follows: asm(arg2 arg1 -> 1 0)
.
// Changing the order of return values compared to the stack// and switching the order of arguments as wellasm(s len -> 1 0) fun asmLoadInt(len: Int, s: Slice): SliceInt { LDIX }// ↑ ↑// | Value of the stack register 0,// | which is the topmost value on the stack// Value of the stack register 1,// which is the second-to-top value on the stack// And the return type `SliceInt`, which is the following Struct:struct SliceInt { s: Slice; val: Int }
Using all those rearranged functions together, we get:
asm(c self) extends fun asmStoreDict(self: Builder, c: Cell?): Builder { STDICT }asm(-> 1 0) extends fun asmLoadCoins(self: Slice): SliceInt { LDVARUINT16 }asm(s len -> 1 0) fun asmLoadInt(len: Int, s: Slice): SliceInt { LDIX }struct SliceInt { s: Slice; val: Int }
fun showcase() { let b = beginCell() .storeCoins(42) .storeInt(27, 10) .asmStoreDict(emptyMap());
let s = b.asSlice(); let si: SliceInt = s.asmLoadCoins(); // Slice remainder and 42 s = si.s; // assigning the modified Slice let coins = si.val; // 42 let si2: SliceInt = asmLoadInt(10, s); // Slice remainder and 27}
Arrangements do not drop or discard any values — they only manipulate the order of arguments and return values as declared. This means, for example, that an arrangement cannot access values from the stack that are not captured by the return type of the assembly function.
That said, there’s a caveat to the mutates
attribute and asm
arrangements.
Limitations
Attempts to drop the number of stack values below throw an exception with exit code 2: Stack underflow
.
asm fun drop() { DROP }
fun exitCode2() { // Drops far more elements from the stack // than there are, causing an underflow repeat (100) { drop() }}
The TVM stack itself has no limit on the total number of values, so you can theoretically push new values onto it until you run out of gas. However, various continuations may have a maximum number of values defined for their inner stacks, exceeding which will throw an exception with exit code 3: Stack overflow
.
asm fun stackOverflow() { x{} SLICE // s BLESS // c 0 SETNUMARGS // c' 2 PUSHINT // c' 2 SWAP // 2 c' 1 -1 SETCONTARGS // ← this blows up}
fun exitCode3() { // Overflows the inner stack of a continuation stackOverflow();}
Although there are only stack registers, the stack itself can have more than values in total. The deeper values won’t be immediately accessible by any TVM instructions, but they will be on the stack nonetheless.
Caveats
Case sensitivity
TVM instructions are case-sensitive and are always written in upper case (capital letters).
/// ERROR!asm fun bad1(): Cell { mycode }
/// ERROR!asm fun bad2(): Cell { MyCoDe }
/// 👍asm fun good(): Cell { MYCODE }
No double quotes needed
It is not necessary to enclose TVM instructions in double quotes. On the contrary, they will then be interpreted as strings, which is probably not what you want:
// Pushes the string "MYCODE" onto the compile-time stack,// where it gets discarded even before the compute phase startsasm fun wrongMyCode() { "MYCODE" }
// Invokes the TVM instruction MYCODE during the compute phase,// which returns the contract code as a Cellasm fun myCode(): Cell { MYCODE }
mutates
consumes an extra value
Specifying the mutates
attribute, i.e. defining a mutation function, makes the assembly function consume one additional value deeper in the stack than the declared return values. Consider the following example:
asm(-> 1 0) extends mutates fun loadRef(self: Slice): Cell { LDREF }
Here, the LDREF
instruction produces two stack entries: a Cell
and a modified Slice
, in that order, with the Slice
pushed on top of the stack. Then, the arrangement -> 1 0
inverts those values, making the Cell
sit on top of the stack.
Finally, the mutates
attribute makes the function consume the deepest value on the stack, i.e. Slice
, and assigns it to self
, while returning the Cell
value to the caller.
Overall, the mutates
attribute can be useful in some cases, but you must remain vigilant when using it with assembly functions.
Don’t rely on initial stack values
The TVM places a couple of values onto its stack upon initialization, and those values are based on the event that caused the transaction. In other languages, you might need to rely on their order and types, while in Tact the parsing is done for you. Thus, in Tact, these initial stack values are different from what’s described in TON Docs.
Therefore, to access details such as the amount of nanoToncoins in a message or the Address
of the sender, it’s strongly recommended to call the context()
or sender()
functions instead of attempting to look for those values on the stack.
Don’t rely on the order of fields in structures in the standard library
To reduce the number of stack-manipulating statements and to save gas, the order of the fields of structures in the standard library may change between releases. Such changes do not affect any kinds of functions other than asm
functions.
That’s because assembly functions, unlike all others, require a certain order of values on the stack before and after their execution. If this order is disrupted, the entire logic of such functions can be broken immediately.
Hence, when writing assembly functions, make sure you are not using the built-in [Structs] and [Messages] from Tact’s library. If you want to target a specific structure, make a copy of it in your code, and then use that copy so that it won’t be affected by future Tact updates.
Debugging
The number of values the stack has at any given time is called the depth, and it is accessible via the DEPTH
instruction. It’s quite handy for seeing the number of values before and after calling the assembly functions you’re debugging, and it can be used within asm logic.
asm fun depth(): Int { DEPTH }
To see both the stack depth and the values on it, there is a function in the Core library of Tact: dumpStack()
. It’s useful for keeping track of the stack while debugging, although it is computationally expensive and only prints values instead of returning them. Therefore, use it sparingly and only during testing.
Read more about debugging Tact contracts on the dedicated page: Debugging.
Attributes
The following attributes can be specified:
inline
— does nothing, since assembly functions are always inlined.extends
— makes it an extension function.mutates
(along withextends
) — makes it an extension mutation function.
These attributes cannot be specified:
abstract
— assembly functions must have a defined body.virtual
andoverride
— assembly functions cannot be defined within a contract or a trait.get
— assembly functions cannot be getters.
/// `Builder.storeCoins()` extension functionasm extends fun storeCoins(self: Builder, value: Int): Builder { STVARUINT16}
/// `Slice.skipBits()` extension mutation functionasm extends mutates fun skipBits(self: Slice, l: Int) { SDSKIPFIRST}
Interesting examples
On the TVM instructions page, you may have noticed that the “signatures” of instructions are written in a special form called stack notation, which describes the state of the stack before and after a given instruction is executed.
For example, x y - z
describes an instruction that grabs two values x
and y
from the stack, with y
at the top of the stack and x
second to the top, and then pushes the result z
onto the stack. Notice that other values deeper down the stack are not accessed.
This notation omits type information and only implicitly describes the state of stack registers, so for the following examples we’ll use a different one, combining the notions of parameters and return values with the stack notation like this:
// The types of parameters// | | and types of return values are shown// ↓ ↓ ↓// x:Int, y:Int → z:Int — all comma-separated// ————————————————————// s1 s0 → s0// ↑ ↑ ↑// And the stack registers are shown too,// which helps visually map them onto parameters and return values
When literals are involved, they’ll be shown as is. Additionally, when values on the stack do not represent parameters or Struct fields of the return type, only their type is given.
keccak256
// Computes and returns the Keccak-256 hash as a 256-bit unsigned `Int`// from a passed `Slice` `s`. Uses an Ethereum-compatible* implementation.asm fun keccak256(s: Slice): Int { // s:Slice → s:Slice, 1 // ————————————————————— // s0 → s1 s0 ONE
// s:Slice, 1 → h:Int // ——————————————————— // s1 s0 → s0 HASHEXT_KECCAK256}
The HASHEXT_SHA256
and HASHEXT_BLAKE2B
instructions can be used in a similar manner with respect to a different number of return values. In addition, all of those can also work with values of type Builder
.
The HASHEXT_KECCAK512
and HASHEXT_SHA512
, however, put a tuple of two integers on the stack instead of putting two separate integers there. Because of that, you need to also add the UNPAIR
instruction right after them.
// Computes and returns the Keccak-512 hash in two 256-bit unsigned `Int`// values from a passed `Slice` `s`. Uses an Ethereum-compatible* implementation.asm fun keccak512(s: Slice): Hash512 { // s:Slice → s:Slice, 1 // ————————————————————— // s0 → s1 s0 ONE
// s:Slice, 1 → Tuple(h1:Int, h2:Int) // ——————————————————————————————————— // s1 s0 → s0 HASHEXT_KECCAK512
// Tuple(h1:Int, h2:Int) → h1:Int, h2:Int // ————————————————————————————————————— // s0 → s1 s2 UNPAIR // could've used UNTUPLE in a more general case too}
// Helper Structstruct Hash512 { h1: Int; h2: Int }
While it is said that these sample keccak256()
and keccak512()
functions use an Ethereum-compatible implementation, note that the underlying HASHEXT
family of TVM instructions has its own drawbacks.
These drawbacks stem from the limitations of the Slice
type itself — HASHEXT_KECCAK256
and other hashing instructions of the HASHEXT
family ignore any references present in the passed slice(s). That is, only up to 1023 bits of their data are used.
To work around this, you can recursively load all the refs from the given Slice
and then hash them all at once by specifying their exact number instead of the ONE
TVM instruction used earlier. See an example below: onchainSha256
.
isUint8
Mapping onto a single instruction by itself is inefficient if the values it places onto the stack can vary depending on certain conditions. That is because one cannot map them to Tact types directly and often needs additional stack manipulations before or after their execution.
Since this is often the case for the “quiet” versions of instructions, the recommendation is to prefer their non-quiet alternatives. Usually, non-quiet versions throw exceptions and are consistent in their return values, while quiet ones push or other values onto the stack, thus varying the number or the type of their result values.
For simpler cases such as this example, it is convenient to do all the stack manipulations within the same function.
// Checks if the given `Int` `val` is in// the inclusive range from 0 to 255asm fun isUint8(val: Int): Bool { // val:Int → val:Int or NaN // ———————————————————————— // s0 → s0 8 QUFITS
// val:Int or NaN → Bool // ————————————————————— // s0 → s0 ISNAN
// Since ISNAN gives true when the `val` is NaN, // i.e., when the `val` did not fit into the uint8 range, // we need to flip it
// Bool → Bool // ——————————— // s0 → s0 NOT // could've used 0 EQINT too}
fun showcase() { isUint8(55); // true isUint8(-55); // false isUint8(pow(2, 8)); // false isUint8(pow(2, 8) - 1); // true}
ecrecover
This example shows one possible way to work with partially captured results from the stack, obtaining the omitted ones later.
// Recovers a public key from the signature as done in Bitcoin or Ethereum.//// Takes the 256-bit unsigned integer `hash` and the 65-byte signature consisting of:// * 8-bit unsigned integer `v`// * 256-bit unsigned integers `r` and `s`//// Returns `null` on failure or an `EcrecoverKey` structure on success.fun ecrecover(hash: Int, v: Int, r: Int, s: Int): EcrecoverKey? { let successful = _ecrecoverExecute(hash, v, r, s); if (successful) { return _ecrecoverSuccess(); } else { return null; }}
// The 65-byte public key returned by `ecrecover()` in case of success,// which consists of the 8-bit unsigned integer `h`// and 256-bit unsigned integers `x1` and `x2`.struct EcrecoverKey { h: Int as uint8; x1: Int as uint256; x2: Int as uint256;}
// Underlying assembly function that does the work// and only captures the topmost value from the stack.//// Since the `ECRECOVER` instruction places 0 on top of the stack// in case of failure and -1 in case of success,// this maps nicely onto the Bool type.asm fun _ecrecoverExecute(hash: Int, v: Int, r: Int, s: Int): Bool { ECRECOVER }
// Simply captures the values from the stack// if the call to `_ecrecoverExecute()` was successful.asm fun _ecrecoverSuccess(): EcrecoverKey { }
onchainSha256
This example extends the ecrecover()
example and adds more complex stack management and interaction with Tact statements, such as loops.
// Calculates and returns the SHA-256 hash// as a 256-bit unsigned `Int` of the given `data`.// Unlike the `sha256()` function from the Core library,// this one works purely on-chain (at runtime), hashing the strings completely,// whereas the `sha256()` reliably works only with their first 1023 bits of data.fun onchainSha256(data: String): Int { _onchainShaPush(data); while (_onchainShaShouldProceed()) { _onchainShaOperate(); } return _onchainShaHashExt();}
// Helper assembly functions,// each manipulating the stack in its own way// in different parts of the `onchainSha256()` function.asm fun _onchainShaPush(data: String) { ONE }asm fun _onchainShaShouldProceed(): Bool { OVER SREFS 0 NEQINT }asm fun _onchainShaOperate() { OVER LDREF s0 POP CTOS s0 s1 XCHG INC }asm fun _onchainShaHashExt(): Int { HASHEXT_SHA256 }