Ethereum Attackathon — Vyper Under the Microscope

by anatomist

Introduction

In this blog, we will walk through the findings we reported during the Ethereum Attackathon. The attackathon had a 1.5M reward pool, but only 0.5M is unlocked. These bugs totaled nearly 150K in rewards, which allegedly earned us 1st place in the Attackathon. The reports have already been published (that's how we guessed our ranking), but the official results aren't out yet. So we figured, why not publish a blog in the meantime? 😄

[Figure: Anatomist's Ethereum Attackathon status]

In this post, we will dive into the technical details of the Vyper compiler, as well as the issues we discovered, including miscompilations, faulty optimizations, and edge cases that may lead to incorrect or insecure contract behavior. By the end, you will have a deeper understanding of how various stages of the Vyper compiler operate, along with some practical insights on avoiding common security pitfalls.

We've selected 5 of our findings to share here. And who knows, maybe you’ll spot them next time and claim the rewards for yourself 🙂

Incorrect HexString Parsing Leads To Compilation Error Or Type Confusion

First, let’s look at the frontend phase of the Vyper compilation process, specifically the transition from source code to abstract syntax tree (AST). Vyper recently introduced a custom syntax for hexadecimal byte strings, written as x"..." (e.g., x"6161"). This is treated as a Bytes[] literal. However, because x"..." is not valid Python syntax, a pre-parsing step has to rewrite the source before it is passed to Python's ast module.

The problem arises because position adjustments are not properly tracked. The HexStringParser class is responsible for stripping the leading x from the syntax (so x"6161" becomes "6161") and for recording the positions of these occurrences, which the compiler later reads as hex_string_locations.

    class HexStringParser:
        def __init__(self):
            self.locations = []
            self._current_x = None
            self._state = ParserState.NOT_RUNNING

        def consume(self, token, result):
            # prepare to check if the next token is a STRING
            if token.type == NAME and token.string == "x":
                self._state = ParserState.RUNNING
                self._current_x = token
                return True

            if self._state == ParserState.NOT_RUNNING:
                return False

            if self._state == ParserState.RUNNING:
                current_x = self._current_x
                self._current_x = None
                self._state = ParserState.NOT_RUNNING

                toks = [current_x]

                # drop the leading x token if the next token is a STRING to avoid a python
                # parser error
                if token.type == STRING:
                    self.locations.append(current_x.start)
                    toks = [TokenInfo(STRING, token.string, current_x.start, token.end, token.line)]
                    result.extend(toks)
                    return True

                result.extend(toks)

            return False

After parsing, when the compiler visits AST nodes, it relies on hex_string_locations to distinguish between Bytes and Str types.

    def visit_Constant(self, node):
        ...
        elif isinstance(node.value, str):
            key = (node.lineno, node.col_offset)
            if key in self._pre_parser.hex_string_locations:
                if len(node.value) % 2 != 0:
                    raise SyntaxException(
                        "Hex string must have an even number of characters",
                        self._source_code,
                        node.lineno,
                        node.col_offset,
                    )
                node.ast_type = "HexBytes"
            else:
                node.ast_type = "Str"
        ...

However, if earlier code transformations have altered the source locations, the positions stored in self.locations become incorrect. The HexString pre-parsing may run at a point where earlier rewrites have already shifted the expected positions, and removing the leading x shifts the positions of the code that follows it as well.

These issues can lead to several consequences, including: legal code failing to compile, type confusion when combined with user errors in type specification, and illegal code compiling successfully when it should not. We will explore an example of each scenario:

Compilation Failure

If the offset tracking is incorrect, the compiler may fail to recognize HexStrings properly, causing compilation failures for otherwise valid code.

    event X:
        a: Bytes[2]

    @deploy
    def __init__():
        log X(a = x"6161") #log changes offset of HexString, and the hex_string_locations tracked location is incorrect when visiting ast

The position of x"6161" shifts due to log X(a = ...), causing a mismatch with hex_string_locations and resulting in a compilation error.
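The mismatch can be illustrated with a toy model. The sketch below is not Vyper's pre-parser: `toy_pre_parse`, `string_locations`, and the regex are hypothetical names, and the rewrite of `log` into a longer Python keyword is an assumption made only to get a length-changing edit on the same line. It records the hex-string position in the original text, then asks Python's `ast` module where the literal ended up.

```python
import ast
import re

# Hypothetical stand-in for the real pre-parser: matches the `x` prefix of x"..." / x'...'.
HEX_PREFIX = re.compile(r"\bx(?=[\"'])")


def toy_pre_parse(source):
    """Rewrite toy source into valid Python and record hex-string locations.

    Locations are recorded as (lineno, col) in the ORIGINAL text, before any
    length-changing rewrite on the same line is applied.
    """
    recorded = []
    out_lines = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        recorded.extend((lineno, m.start()) for m in HEX_PREFIX.finditer(line))
        line = HEX_PREFIX.sub("", line)           # x"6161" -> "6161"
        line = re.sub(r"\blog\b", "yield", line)  # assumed keyword rewrite (3 -> 5 chars)
        out_lines.append(line)
    return "\n".join(out_lines) + "\n", recorded


def string_locations(py_source):
    """Return (lineno, col_offset) of every str constant as Python's ast sees it."""
    tree = ast.parse(py_source)
    return sorted(
        (node.lineno, node.col_offset)
        for node in ast.walk(tree)
        if isinstance(node, ast.Constant) and isinstance(node.value, str)
    )


# No earlier rewrite on the line: recorded and parsed locations agree.
plain_src, plain_recorded = toy_pre_parse('def f():\n    a = x"6161"\n')
plain_parsed = string_locations(plain_src)

# A longer keyword earlier on the line: the parsed column no longer matches.
shifted_src, shifted_recorded = toy_pre_parse('def f():\n    log X(a = x"6161")\n')
shifted_parsed = string_locations(shifted_src)
```

In the second case the lookup key built from the AST node is not in the recorded set, so a visitor like the one above would classify the literal as a plain string rather than as hex bytes.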

Type Confusion + Incorrect Argument Interpretation

A more significant issue is that code that should fail to compile may instead compile successfully, leading to both type confusion and incorrect value interpretation. For example, the following code should not compile, but it does, resulting in "6161" being incorrectly passed as the second argument to FooBar.test instead of b"\x61\x61".

    interface FooBar:
        def test(a: Bytes[2], b: String[4]): payable

    @deploy
    def __init__(ext: FooBar):
        extcall ext.test(x'6161', x'6161') #ext.test(b'\x61\x61', '6161') gets called

Allowing Illegal Code to Compile

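Lastly, the state machine of the HexString pre-parser is also flawed. Every x token overwrites the pending one without emitting it, so a run of stray x tokens is silently dropped. This allows invalid code such as the following to pass compilation.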

    @deploy
    def __init__():
        a: Bytes[2] = x x x x x"6161"
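To see why the stray tokens disappear, here is a simplified re-statement of the consume() logic on top of Python's `tokenize` module. The function name `strip_hex_prefixes` and the single `pending_x` variable are our own simplifications (they stand in for `_current_x` and the parser state), so treat this as a sketch of the behavior rather than the compiler's code.

```python
import io
import tokenize
from tokenize import NAME, STRING, TokenInfo


def strip_hex_prefixes(source):
    """Simplified re-statement of the consume() logic shown earlier."""
    result, locations = [], []
    pending_x = None  # plays the role of _current_x / ParserState.RUNNING

    for token in tokenize.generate_tokens(io.StringIO(source).readline):
        if token.type == NAME and token.string == "x":
            # A new `x` silently replaces any pending one, which is never emitted.
            pending_x = token
            continue
        if pending_x is not None:
            current_x, pending_x = pending_x, None
            if token.type == STRING:
                locations.append(current_x.start)
                result.append(
                    TokenInfo(STRING, token.string, current_x.start, token.end, token.line)
                )
                continue
            result.append(current_x)
        result.append(token)

    # Keep only tokens with visible text (drops NEWLINE / ENDMARKER).
    visible = [tok.string for tok in result if tok.string.strip()]
    return visible, locations


stray_tokens, stray_locations = strip_hex_prefixes('a = x x x x x"6161"\n')
normal_tokens, normal_locations = strip_hex_prefixes("a = x + 1\n")
```

Only the last x before the string survives as a recorded location; the four earlier ones vanish from the token stream, while an ordinary variable named x is passed through unchanged.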

AugAssign evaluation order causing OOB write within the object

Next, we move on to the AST to intermediate representation (IR) phase. Our second finding falls into a long-standing and notoriously subtle class of compiler bugs: incorrect evaluation order and improper handling of side effects. This issue has led to major vulnerabilities across many compilers and interpreters in the past, and Vyper is not exempt.

In Vyper, this bug arises in Augmented Assignment (AugAssign) statements (i.e., a += b, a -= b, etc.). These statements do not properly account for side effects caused by the right-hand side (RHS) of the operation. This can lead to out-of-bounds writes in dynamic arrays, potentially allowing unintended memory modifications.

For example, when parsing an AugAssign statement, Vyper first caches the left-hand side (LHS) target (e.g., an array element) before evaluating the right-hand side (RHS) operation. This prevents the target from being evaluated twice: in a = a + x, for instance, caching a avoids evaluating the expression a a second time.

    def parse_AugAssign(self):
        target = self._get_target(self.stmt.target)
        right = Expr.parse_value_expr(self.stmt.value, self.context)

        if not target.typ._is_prim_word:
            # because of this check, we do not need to check for
            # make_setter references lhs<->rhs as in parse_Assign -
            # single word load/stores are atomic.
            raise TypeCheckFailure("unreachable")

        with target.cache_when_complex("_loc") as (b, target):
            left = IRnode.from_list(LOAD(target), typ=target.typ)
            new_val = Expr.handle_binop(self.stmt.op, left, right, self.context)
            return b.resolve(STORE(target, new_val))

The vulnerability occurs when the target is an element of a dynamic array (DynArray) and the RHS operation modifies that array (e.g., using .pop()). Since Vyper caches the target reference before evaluating the RHS, any change to the array size makes the cached reference stale. The write then goes through this outdated reference to an invalid or out-of-bounds index, because the array length has changed while the stored reference has not.

As a result, the final STORE(target, new_val) may write beyond the valid bounds of the array, leading to flawed intermediate representation (IR) code, causing unexpected behavior in smart contracts.
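The ordering problem can be modeled outside the compiler. The sketch below is a plain-Python analogy, not Vyper IR: `ToyDynArray` and `aug_add_cached_target` are hypothetical names, and the array is modeled as fixed backing storage plus a separate live length, which is an assumption about layout made only for illustration.

```python
class ToyDynArray:
    """Toy dynamic array: fixed backing storage plus a separate live length."""

    def __init__(self, capacity, values):
        self.length = len(values)
        self.slots = list(values) + [0] * (capacity - len(values))

    def element_ptr(self, index):
        # The bounds check happens once, when the pointer is computed.
        if not 0 <= index < self.length:
            raise IndexError("index out of bounds")
        return index

    def pop(self):
        if self.length == 0:
            raise IndexError("pop from empty array")
        self.length -= 1
        return self.slots[self.length]


def aug_add_cached_target(arr, index, rhs):
    """Mimics `arr[index] += rhs()` with the target resolved before the RHS."""
    ptr = arr.element_ptr(index)  # target cached (and bounds-checked) first
    value = rhs()                 # RHS side effect may shrink the array
    arr.slots[ptr] += value       # store through the now-stale pointer


arr = ToyDynArray(capacity=3, values=[1, 2, 3])
aug_add_cached_target(arr, 2, arr.pop)  # models: arr[2] += arr.pop()
```

After the call the live length is 2, yet slot 2 has been written: the store landed past the end of the array because the bounds check ran before the pop.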

This issue is not specific to Vyper, but instead representative of a broader class of bugs that arise when compilers assume side-effect-free expressions without verifying them. Past examples include:

Due to its subtlety and the wide-reaching consequences it can have in low-level code generation, this bug class is often difficult to detect and prevent—especially in languages that generate IR or bytecode for safety-critical environments like blockchains.

Incorrectly Eliminated Code With Side Effect in Concat Args

The next vulnerability is still in the AST to IR phase. But this time, instead of AST to IR translation, the focus will be on optimizations. In Vyper's copy_bytes function, certain expressions are eliminated during compilation. If length == 0 or length_bound == 0, the copy_bytes function immediately discards all operations without checking for potential side effects. Expressions with side effects may consequently be optimized away.

For example, arguments in concat() can be eliminated, leading to unintended behavior in the compiled code. Let's explore the details of the implementation.

The Concat built-in function uses copy_bytes to copy arguments into a destination buffer. Within build_IR(), each arg is processed as follows:

  • If the argument is a bytestring, its data is extracted (bytes_data_ptr(arg)).
  • Its length is determined (get_bytearray_length(arg)).
  • The copy_bytes() function is then called to transfer the data.

    class Concat(BuiltinFunctionT):
        ...
        def build_IR(self, expr, context):
            ...
            for arg in args:
                dst_data = add_ofst(bytes_data_ptr(dst), ofst)

                if isinstance(arg.typ, _BytestringT):
                    # Ignore empty strings
                    if arg.typ.maxlen == 0:
                        continue

                    with arg.cache_when_complex("arg") as (b1, arg):
                        argdata = bytes_data_ptr(arg)

                        with get_bytearray_length(arg).cache_when_complex("len") as (b2, arglen):
                            do_copy = [
                                "seq",
                                copy_bytes(dst_data, argdata, arglen, arg.typ.maxlen), #utilize copy_bytes
                                ["set", ofst, ["add", ofst, arglen]],
                            ]
                            ret.append(b1.resolve(b2.resolve(do_copy)))

            ...

Notably, copy_bytes may short circuit and discard the entire argument when length_bound or length is a constant 0.

    def copy_bytes(dst, src, length, length_bound):
        annotation = f"copy up to {length_bound} bytes from {src} to {dst}"

        ...

        with src.cache_when_complex("src") as (b1, src), length.cache_when_complex(
            "copy_bytes_count"
        ) as (b2, length), dst.cache_when_complex("dst") as (b3, dst):
            assert isinstance(length_bound, int) and length_bound >= 0

            # correctness: do not clobber dst
            if length_bound == 0:
                return IRnode.from_list(["seq"], annotation=annotation) #short circuit 1
            # performance: if we know that length is 0, do not copy anything
            if length.value == 0:
                return IRnode.from_list(["seq"], annotation=annotation) #short circuit 2

This behavior is normally safe because _BytestringT variables must be declared with a size greater than zero. However, "literal-like arguments" (e.g., ternary expressions like b"" if True else b"") do not enjoy the same invariant and may have a maximum length of 0. When such expressions are passed directly to a function (rather than being assigned to a variable first), the compiler wrongly assumes they are redundant and eliminates them.

Consider this code snippet:

    x: uint256

    def test():
        a: Bytes[256] = concat(b"" if self.sideeffect() else b"", b"aaaa")

    def sideeffect() -> bool:
        self.x += 1
        return True

  • Here, b"" if self.sideeffect() else b"" is passed directly to concat().
  • Since this conditional expression is not assigned to a variable first, its type has a maximum length of 0.
  • copy_bytes() then short-circuits, eliminating the first argument (b"" if self.sideeffect() else b"").
  • As a result, the sideeffect() function never executes, meaning self.x is not incremented. The essential side effect is neglected, altering program behavior.
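The effect of the short circuit can be sketched with deferred operations standing in for IR. Everything here is hypothetical and for illustration only: `copy_short_circuit` mirrors the early return on a zero bound, and `copy_preserving_effects` is one possible alternative that still evaluates the source; it is not claimed to be how the compiler was or should be fixed.

```python
def copy_short_circuit(dst, src_expr, length_bound):
    """Toy copy: returns a list of deferred operations (our stand-in for IR)."""
    if length_bound == 0:
        return []  # src_expr is dropped without ever being evaluated
    return [lambda: dst.extend(src_expr())]


def copy_preserving_effects(dst, src_expr, length_bound):
    """Possible alternative: skip the copy but still evaluate the source."""
    if length_bound == 0:
        return [lambda: src_expr()]  # evaluate for side effects, discard the value
    return [lambda: dst.extend(src_expr())]


def run(ops):
    """Execute the deferred operations in order."""
    for op in ops:
        op()


state = {"x": 0}


def empty_with_side_effect():
    # Stand-in for: b"" if self.sideeffect() else b""
    state["x"] += 1
    return b""


dst = bytearray()

run(copy_short_circuit(dst, empty_with_side_effect, 0))
x_after_short_circuit = state["x"]

run(copy_preserving_effects(dst, empty_with_side_effect, 0))
x_after_preserving = state["x"]

run(copy_short_circuit(dst, lambda: b"aaaa", 4))
```

With the short circuit the counter never moves, which mirrors self.x staying unchanged in the contract above; the effect-preserving variant increments it once while still copying nothing.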

This issue reflects a recurring pattern in Vyper where certain optimizations are applied too early in the compilation pipeline, leading to occasional incorrect bytecode emission. While such cases may appear uncommon and depend on less typical coding patterns (i.e., require developers to write "strange code"), they suggest underlying challenges in the current IR optimization strategy.

While this specific case may not immediately introduce severe vulnerabilities, it highlights the importance of properly modeling the effects of operations before applying optimizations, as incorrectly eliminating expressions could lead to skipping critical security checks.

IRNode Multi-Evaluation in For List Iter

We're getting to the end! Our next issue belongs to the final stage of compilation — IR to assembly. This is a multi-evaluation issue in Vyper's for-loop handling, specifically when iterating over dynamic and static arrays (DArrayT and SArrayT). The problem arises because Vyper does not always cache the iterable expression before evaluation, leading to unintended repeated evaluations, especially when the iterable contains side effects (e.g., function calls that modify state).

Let's break down the code and the explanation of the issue in more detail.

Vyper supports two types of iterables for the for loop: range and iterable types like SArrayT (static arrays) and DArrayT (dynamic arrays). The issue is specifically related to iterating over iterable types (i.e., lists).

    def _analyse_list_iter(self, iter_node, target_type):
        # iteration over a variable or literal list
        iter_val = iter_node.reduced()

        if isinstance(iter_val, vy_ast.List):
            len_ = len(iter_val.elements)
            if len_ == 0:
                raise StructureException("For loop must have at least 1 iteration", iter_node)
            iter_type = SArrayT(target_type, len_)
        else:
            try:
                iter_type = get_exact_type_from_node(iter_node)
            except (InvalidType, StructureException):
                raise InvalidType("Not an iterable type", iter_node)

        if not isinstance(iter_type, (DArrayT, SArrayT)):
            raise InvalidType("Not an iterable type", iter_node)
        ...

This code checks the type of the iterable passed to the for-loop. If it’s a list (like [1, 2, 3]), it checks the length of the list and validates that it’s an acceptable iterable type.

Vyper tries to force the iterable of a list loop to be constant, i.e., to not modify state or call functions that change values.

    def _parse_For_list(self):
        with self.context.range_scope():
            iter_list = Expr(self.stmt.iter, self.context).ir_node
        ...

    def range_scope(self):
        prev_value = self.in_range_expr
        self.in_range_expr = True
        yield
        self.in_range_expr = prev_value

    def is_constant(self):
        return self.constancy is Constancy.Constant or self.in_range_expr

However, this only ensures that the iterable does not introduce side effects. It does not stop the iterable from consuming side effects, i.e., from reading state that other code modifies.

The main issue occurs in _parse_For_list. When parsing the iterable in a for-loop, Vyper does not cache the iterator expression (iter_list) before it’s used. If the expression is evaluated multiple times in different parts of the loop, side effects can be consumed multiple times, leading to unexpected behavior.

    def _parse_For_list(self):
        with self.context.range_scope():
            iter_list = Expr(self.stmt.iter, self.context).ir_node

        ...

        # set up the loop variable

        # BUG: iter_list first use
        e = get_element_ptr(iter_list, i, array_bounds_check=False)
        body = ["seq", make_setter(loop_var, e), parse_body(self.stmt.body, self.context)]

        ...

        if isinstance(iter_list.typ, DArrayT):
            # BUG: iter_list second use (DArrayT)
            array_len = get_dyn_array_count(iter_list)
        else:
            ...

        ret.append(["repeat", i, 0, array_len, repeat_bound, body])

The iter_list is not cached properly before its first use. In the case of a DArrayT, the iter_list expression is evaluated multiple times. This can lead to multi-evaluation, where side effects from the iterable are consumed multiple times.

It is clear that double evaluation can occur with DArrayT, and the following PoC confirms this.

    x: DynArray[uint256, 3]

    @external
    def test():
        for i: uint256 in (self.usesideeffect() if True else self.usesideeffect()):
            pass

    def usesideeffect() -> DynArray[uint256, 3]:
        return self.x

In fact, double evaluation has a long history in Vyper, to the point that a corresponding defense-in-depth mechanism has been introduced. unique_symbols are IR nodes inserted into risky IR building blocks (e.g., function calls). The idea is that if these IR nodes are accidentally cloned and inserted multiple times when building bigger IR blocks, a single unique_symbol IR node will appear more than once in the final IR. Therefore, we can catch incorrect multi-evaluations and have the compiler bail out instead of outputting incorrect bytecode.

    elif code.value == "unique_symbol":
        symbol = code.args[0].value
        assert isinstance(symbol, str)

        if symbol in existing_labels:
            raise Exception(f"symbol {symbol} already exists!")
        else:
            existing_labels.add(symbol)

        return []

Thanks to this check, the PoC we just provided triggers a compiler panic instead of producing incorrect bytecode.

Crisis averted? Not quite. Now, let's examine the SArrayT case. Unlike DArrayT, where double evaluation is more obvious, SArrayT only instantiates iter_list once. However, since this instantiation occurs within the body of a repeat IR, it can still be evaluated multiple times. The unique_symbol check fails to catch this issue because it only traverses the IR tree recursively, without accounting for the semantics of each IR node or how it will be translated into EVM bytecode for runtime execution.

    def _parse_For_list(self):
        ...

        # iter_list first use
        e = get_element_ptr(iter_list, i, array_bounds_check=False)
        # place IRnode `e` into loop body
        body = ["seq", make_setter(loop_var, e), parse_body(self.stmt.body, self.context)]

        repeat_bound = iter_list.typ.count
        if isinstance(iter_list.typ, DArrayT):
            ...
        else:
            array_len = repeat_bound

        ret.append(["repeat", i, 0, array_len, repeat_bound, body])
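The gap in the check can be reproduced on a toy IR made of nested lists. The walker below is a sketch written for this post, not the compiler's implementation: `check_unique_symbols`, `is_rejected`, and the symbol name `call_0` are hypothetical, and the node shapes only loosely follow the snippets above.

```python
def check_unique_symbols(ir, seen=None):
    """Walk a toy nested-list IR and reject a repeated unique_symbol node."""
    if seen is None:
        seen = set()
    if not isinstance(ir, list) or not ir:
        return seen
    if ir[0] == "unique_symbol":
        symbol = ir[1]
        if symbol in seen:
            raise ValueError(f"symbol {symbol} already exists!")
        seen.add(symbol)
        return seen
    for child in ir[1:]:
        check_unique_symbols(child, seen)
    return seen


def is_rejected(ir):
    """Return True when the static walk refuses the IR."""
    try:
        check_unique_symbols(ir)
    except ValueError:
        return True
    return False


call = ["seq", ["unique_symbol", "call_0"], ["call", "usesideeffect"]]

# Cloned into two places: the static walk sees the symbol twice and bails out.
cloned_twice = ["seq", call, call]

# Placed once inside a loop body: appears once in the tree, yet runs three times.
inside_repeat = ["repeat", "i", 0, 3, 3, ["seq", call]]

cloned_rejected = is_rejected(cloned_twice)
repeat_rejected = is_rejected(inside_repeat)
```

A purely structural walk counts occurrences in the tree, not executions at runtime, so the node inside the repeat body passes even though it would be evaluated on every iteration.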

How does this impact execution? Let’s break it down with three examples.

Example 1: Pre-evaluation and Storage in tmp_list

In the first example, the iteration list is pre-evaluated before the loop begins and stored in a temporary variable (tmp_list). This ensures that the list is evaluated only once, preventing multiple evaluations during loop execution. As a result, the log output remains 0, 0, 0 since self.usesideeffect() is called before the loop and does not change during iteration.

    # Example 1
    event I:
        i: uint256

    x: uint256

    @deploy
    def __init__():
        self.x = 0

    @external
    def test():
        for i: uint256 in [self.usesideeffect(), self.usesideeffect(), self.usesideeffect()]:
            self.x += 1
            log I(i)

    @view
    def usesideeffect() -> uint256:
        return self.x

This behavior is expected because the loop iterates over a fixed list that has already been computed, ensuring consistency in the values being logged.

Example 2: Lazy Evaluation with ifexp

In the second example, the iter_list is defined as an if expression (ifexp), meaning it is evaluated inside the loop rather than before it. This results in lazy evaluation, where each iteration triggers a fresh evaluation of self.usesideeffect(), consuming its side effects.

    # Example 2
    event I:
        i: uint256

    x: uint256

    @deploy
    def __init__():
        self.x = 0

    @external
    def test():
        for i: uint256 in ([self.usesideeffect(), self.usesideeffect(), self.usesideeffect()] if True else self.otherclause()):
            self.x += 1
            log I(i)

    @view
    def usesideeffect() -> uint256:
        return self.x

    @view
    def otherclause() -> uint256[3]:
        return [0, 0, 0]

As a consequence, the value of self.x increases on each iteration before being logged, leading to an output of 0, 1, 2 instead of a constant value. This behavior differs from the first example, where iter_list was precomputed and stored.

This highlights the potential risks of multi-evaluation in loops, where an iterable containing side-effect-producing expressions can yield different results based on when and how it is evaluated.
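The difference between the first two examples can be mimicked in plain Python. This is an analogy rather than compiled Vyper: `run_loop`, `three_reads`, and the `cache_iterable` flag are hypothetical names, with the flag standing in for whether the iterable is materialized before the loop or re-evaluated inside its body.

```python
def run_loop(make_iterable, count, cache_iterable):
    """Toy `repeat` loop over a list built from contract state."""
    state = {"x": 0}
    logged = []

    def iterable():
        return make_iterable(state)

    cached = iterable() if cache_iterable else None
    for i in range(count):
        # Without caching, the iterable expression is re-evaluated per iteration.
        items = cached if cache_iterable else iterable()
        item = items[i]
        state["x"] += 1      # loop body: self.x += 1
        logged.append(item)  # loop body: log I(i)
    return logged


def three_reads(state):
    # Stand-in for [self.usesideeffect(), self.usesideeffect(), self.usesideeffect()]
    return [state["x"], state["x"], state["x"]]


evaluated_once = run_loop(three_reads, 3, cache_iterable=True)
evaluated_per_iteration = run_loop(three_reads, 3, cache_iterable=False)
```

The cached variant logs 0, 0, 0 like Example 1, while the per-iteration variant logs 0, 1, 2 like Example 2, even though the loop body is identical.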

Example 3: Another Case of Lazy Evaluation with ifexp

Similar to the second example, the iter_list in this case is defined as an if expression (ifexp), meaning it is evaluated inside the loop. Because of this, the loop re-evaluates self.usesideeffect() on each iteration, consuming its side effects dynamically.

    # Example 3
    event I:
        i: uint256

    x: uint256[3]

    @deploy
    def __init__():
        self.x = [0, 0, 0]

    @external
    def test():
        for i: uint256 in (self.usesideeffect() if True else self.otherclause()):
            self.x[0] += 1
            self.x[1] += 1
            self.x[2] += 1
            log I(i)

    @view
    def usesideeffect() -> uint256[3]:
        return self.x

    @view
    def otherclause() -> uint256[3]:
        return [0, 0, 0]

However, unlike the previous case where self.x was a single integer, here self.x is an array (uint256[3]), and each iteration modifies multiple elements of the array before logging. As a result, the values change in a stepwise manner, leading to the log output of 0, 1, 2.

This reinforces the issue of inconsistent evaluation behavior, where an iterable in a Vyper for loop may either be evaluated once at the beginning or multiple times inside the loop, leading to unpredictable results depending on how the iterable interacts with side effects.

The main impact of this bug is unexpected program behavior. If the iterable in a for-loop contains side effects, those side effects may be applied multiple times, leading to inconsistent and potentially harmful outcomes, such as:

  • Inconsistent state changes in the smart contract.
  • Unexpected financial impacts if the contract interacts with tokens or other state variables.
  • Security vulnerabilities if side effects are critical to the contract's logic or access control.

Incorrect Sqrt Calculation Result

Finally, we get to the application level. The last finding we'd like to share is an incorrect rounding issue in the sqrt builtin function.

Vyper employs the Babylonian method to compute the decimal sqrt(x), with the expectation that it will converge within 256 iterations as outlined below:

    assert x >= 0.0
    z: decimal = 0.0

    if x == 0.0:
        z = 0.0
    else:
        z = x / 2.0 + 0.5
        y: decimal = x

        for i: uint256 in range(256):
            if z == y:
                break
            y = z
            z = (x / z + z) / 2.0

However, if we take a closer look at the Babylonian algorithm, we observe that due to the limited precision of Vyper's fixed-point decimal type, it doesn't always converge to a single value. It can instead oscillate between two adjacent values, alternating between them without ever settling on one.

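For certain inputs, the value of $z$ may oscillate between $N$ and $N+\epsilon$, where $N^2 \le x < (N+\epsilon)^2$ (with $\epsilon$ being the smallest step that the decimal type can represent). To better illustrate this, let's dive into the following analysis, writing $z_n$ for the current estimate and $z_{n+1}$ for the next one.

When we have $z_n = N$, the possible $z_{n+1}$ are:

  1. The Babylonian iteration formula: $z_{n+1} = \frac{x / z_n + z_n}{2}$
  2. Substituting $z_n = N$: $z_{n+1} = \frac{x / N + N}{2}$
  3. Expanding $x = N^2$ and $x = (N+\epsilon)^2$: $\frac{N^2 / N + N}{2} \le z_{n+1} < \frac{(N+\epsilon)^2 / N + N}{2}$
  4. Simplifying both bounds: $N \le z_{n+1} < N + \epsilon + \frac{\epsilon^2}{2N}$
  5. So, the possible candidates for $z_{n+1}$ are: $N$ and $N+\epsilon$ (after truncation to a multiple of $\epsilon$, assuming $\epsilon < 2N$)

When we have $z_n = N+\epsilon$, the range of the next possible $z_{n+1}$ is:

  1. The Babylonian iteration formula: $z_{n+1} = \frac{x / z_n + z_n}{2}$
  2. Substituting $z_n = N+\epsilon$: $z_{n+1} = \frac{x / (N+\epsilon) + N + \epsilon}{2}$
  3. Expanding $x = N^2$ and $x = (N+\epsilon)^2$: $\frac{N^2 / (N+\epsilon) + N + \epsilon}{2} \le z_{n+1} < \frac{(N+\epsilon)^2 / (N+\epsilon) + N + \epsilon}{2}$
  4. Simplifying both bounds: $N + \frac{\epsilon^2}{2(N+\epsilon)} \le z_{n+1} < N + \epsilon$
  5. So, the only possible candidate for $z_{n+1}$ after truncation is: $N$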

Here, we see that oscillating outputs are theoretically possible. To further prove this, we provide a concrete example where such oscillation occurs.

The snippet here returns 0.9999999999, the rounded up result for sqrt(0.9999999998). This is due to the oscillation ending in N+ϵ instead of the correct N.

    @external
    def test():
        d: decimal = 0.9999999998
        r: decimal = sqrt(d) #this will be 0.9999999999
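The oscillation can be replayed with integer arithmetic. The model below is a sketch under stated assumptions: decimals are represented as integers scaled by 10**10 and every division truncates, which is our reading of Vyper's decimal semantics rather than something taken from the compiler source. `toy_decimal_sqrt` and `SCALE` are hypothetical names.

```python
from math import isqrt

SCALE = 10**10  # assumed: decimals are stored as integers scaled by 10**10


def toy_decimal_sqrt(x_scaled):
    """Integer model of the Babylonian loop above (x_scaled = x * 10**10)."""
    assert x_scaled >= 0
    if x_scaled == 0:
        return 0, []
    z = x_scaled // 2 + SCALE // 2  # z = x / 2.0 + 0.5
    y = x_scaled
    trace = []
    for _ in range(256):
        if z == y:
            break
        y = z
        z = (x_scaled * SCALE // z + z) // 2  # z = (x / z + z) / 2.0
        trace.append(z)
    return z, trace


x = 9_999_999_998              # 0.9999999998
result, trace = toy_decimal_sqrt(x)
floor_root = isqrt(x * SCALE)  # largest N with N * N <= x at this scale
```

In this model the loop never hits the z == y exit: the estimate flips between 0.9999999998 and 0.9999999999 on every step, and after the 256th iteration it rests on the larger value, one step above the floored square root.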

Rounding issues are a well-known class of bugs that occasionally lead to high-profile attacks. For example, Uni-v3 style Automated Market Makers (AMMs) could be vulnerable to rounding attacks that nudge ticks across a tick boundary to corrupt liquidity trackers. For those interested, here are a few references to bugs related to rounding-based tick manipulation issues.

As mentioned earlier, incorrect rounding issues can lead to high-profile attacks. But what are some real-world examples? Here are a couple of notable cases:

  • The KyberSwap Incident: A small error similar to our finding—rounding in the wrong direction—led to attacks that resulted in a total loss of over $48M.
  • The Raydium Tick Manipulation Bounty: A flaw in liquidity management allowed attackers to inflate liquidity and extract disproportionate amounts of tokens from the system. A bounty of $505K was awarded to the whitehat for this discovery.

Ending Words

In this blog, we went through 5 vulnerabilities we found in the Ethereum Attackathon, spanning the stages of Vyper compilation—including source parsing, AST construction, IR generation, optimization, assembly emission, and application-level effects. These issues highlight how subtle compiler bugs—ranging from frontend parsing quirks to backend IR and assembly generation flaws—can lead to security implications in smart contracts. By dissecting each stage, we aim to emphasize the need for rigorous compiler auditing and offer developers deeper insights into the risks of relying on compiler correctness in high-stakes environments like Ethereum.

As we wrap up, it’s worth reflecting on what these findings really tell us—not just about Vyper, but about smart contract development as a whole. Vyper has made commendable efforts to prioritize safety and auditability in its design. Still, compiler vulnerabilities are notoriously subtle, and once embedded, they’re hard to detect and costly to mitigate. That’s why the most effective security work starts before any code is written. Prevention—through careful design and informed architectural decisions—is almost always better than cure.

We are Anatomist Security. If you're concerned about security of your contracts, blockchains, compilers or need help assessing potential risks, reach out to us at [email protected], our official website or via our X account. We provide comprehensive and meticulous audits, as well as architectural consulting to help teams build secure systems from the ground up.