Flash CTF - Who's Counting

Challenge overview

The binary counting reads 32 bytes from standard input and prints either:

Correct! if every per-byte check succeeds
Wrong! otherwise

Internally it maintains a global 64-bit state in rbx. For each input byte i (0 through 31) it runs a deterministic transformation that mutates rbx. Then it compares the resulting rbx against a built-in table of 32 expected values stored in .data. If the comparison matches for all indices, it prints Correct!.

Quick triage

From the command line:

file counting

It is an x86-64 ELF, statically linked.

To view the important code and the expected table, we can use objdump:

objdump -d -Mintel ./counting objdump -s -j .data ./counting

What the binary does (start to finish)

1) Program start and input read

The entry point _start seeds the state and reads 32 bytes from stdin into a buffer in .bss:

0000000000401000 <_start>: 401000: xor r12,r12 401003: inc r12 401006: mov ebx,0x12345678 40100b: inc r12 40100e: xor r15,r15 401011: inc r12 401014: lea rsi,[rip+0x10f5] # 402110 <__bss_start> 40101b: inc r12 40101e: lea r14,[rip+0xfeb] # 402010 <expected> 401025: inc r12 401028: mov eax,0x0 40102d: inc r12 401030: mov edi,0x0 401035: inc r12 401038: mov edx,0x20 40103d: inc r12 401040: syscall

Important registers here:

rsi points to the input buffer (at virtual address 402110)
r14 points to the expected table (at virtual address 402010)
ebx is seeded with 0x12345678
r15 is the byte index and starts at 0

The syscall at 401040 is read(0, rsi, 0x20):

eax = 0 selects syscall number 0 on Linux (read)
edi = 0 selects file descriptor 0 (stdin)
rdx = 0x20 is the byte count (32)

After the read, the program clears the "how many indices matched" counter:

401045: xor r11,r11

2) Per-byte outer loop

The main loop runs i = r15 from 0 to 31:

000000000040104b <main_loop>: 40104b: cmp r15,0x20 40104f: jge 40113b <done>

Each iteration reads one input byte:

401058: movzx eax,BYTE PTR [rsi+r15*1]

Then it computes a loop counter r9 from the current global state rbx and the input byte:

401060: mov r8,rax 401066: xor r8,rbx 401072: and r8,0xff 40107c: add r8,0x5 401083: mov r9,r8

So r9 = ((rbx XOR byte) AND 0xff) + 5.

Also note this instruction:

401060: mov r13,r12

r13 snapshots r12 at the start of processing each byte.

The abnormal `inc r12` counter

One striking property of this binary is that it executes inc r12 extremely frequently. In the disassembly above you can already see it right after the function prologue and repeated after most operations.

This is not dead code. The value in r12 is used as part of the state mixing:

In path_a the code does xor rbx, r12
In path_b the code does add rbx, r12 and then multiplies by rbx
In the check it uses the current r12 value (mov rdx, r12) and subtracts the snapshot (sub rdx, r13)

So the effective transformation for byte index i depends on a rapidly changing counter that advances on almost every executed instruction. That is why the program feels "more procedural" than a normal hand-written checksum: r12 is acting like a time-like instruction counter that gets folded into the arithmetic.

3) Inner loop: mutating `rbx` based on the guessed byte

The inner loop runs while r9 != 0:

0000000000401089 <byte_loop>: 401089: test r9,r9 40108c: je 4010d7 <byte_done>

At each inner-loop iteration:

It computes a temporary value r10:

401091: mov r10,r9 401097: xor r10,rax 40109d: and r10,0x1

Here rax still holds the input byte.

It branches on r10:

4010a4: cmp r10,0x0 4010a8: jne 4010bf <path_b>

So:

if ((r9 XOR byte) AND 1) == 0 it takes path_a
otherwise it takes path_b

path_a:

00000000004010ad <path_a>: 4010ad: xor rbx,r12 4010b3: rol rbx,0x3

path_b:

00000000004010bf <path_b>: 4010bf: add rbx,r12 4010c5: imul rbx,rbx,0x7

After either path, it decrements r9 and repeats:

00000000004010cc <path_end>: 4010cc: dec r9 4010d2: jmp 401089 <byte_loop>

4) "Finish mixing" and the actual per-byte check

Once the inner loop ends, it runs the check code at byte_done:

00000000004010d7 <byte_done>: 4010d7: mov rdx,r12 4010dd: sub rdx,r13 4010e3: mov rcx,r15 4010e9: and rcx,0x7 4010f0: rol rdx,cl 4010f6: xor rbx,rdx 4010fc: rol rbx,0x5 401103: mov rax,rbx 401109: mov r8,QWORD PTR [r14+r15*8] 401110: xor rax,r8 401116: cmp rax,0x0 40111a: sete al 401120: movzx r10,al 401127: add r11,r10

What this means:

rdx = r12 - r13
it rotates rdx by (r15 & 7)
it applies rbx = ROL( rbX XOR rdx, 5 )
then it compares the resulting rbx to expected[r15]

The comparison is implemented as:

rax = rbx
r8 = expected[r15]
rax ^= r8
sete al sets al = 1 if rax == 0, else al = 0
r10 = zero_extend(al) gives r10 as 0 or 1
finally r11 += r10

So r10 is the per-byte success bit for the current index, and r11 counts how many byte indices matched so far.

After that, it increments the outer-loop index:

40112d: inc r15 401133: jmp 40104b <main_loop>

5) Final decision: all 32 bytes must match

When the outer loop ends:

000000000040113b <done>: 40113b: cmp r11,0x20 40113f: jne 401170 <fail>

So it prints success only if r11 == 32.

On success, it prints Correct!:

0000000000401144 <success>: 401144: mov eax,0x1 40114c: mov edi,0x1 401154: lea rsi,[rip+0xea5] # 402000 <success_msg> 40115e: mov edx,0x9 401166: syscall

On failure, it prints Wrong!:

0000000000401170 <fail>: 401180: lea rsi,[rip+0xe82] # 402009 <fail_msg> 40118a: mov edx,0x7 401192: syscall

The expected table

The expected values are stored in .data starting at 0x402010 and consist of 32 little-endian 64-bit values (qwords). Extracting the raw table from the binary yields:

expected[0] = 0xe1a3610df5d93da8 expected[1] = 0x3e232e23b5dc0997 expected[2] = 0x81513ba5e927a45 expected[3] = 0xa1936838fa7261de expected[4] = 0x5033cd38fb530b5 expected[5] = 0xe70ebb4a0fcea677 expected[6] = 0x6b187da7cdcc6df2 expected[7] = 0x2e797688f4a8a44d expected[8] = 0x1d24cde94345ff4c expected[9] = 0xa102f94e8cd3cd4a expected[10] = 0x6c60ea6e41852aff expected[11] = 0x3e818bf8f946bb78 expected[12] = 0xd6835df6aa72dc0b expected[13] = 0xd89e795c51f8cbef expected[14] = 0x50851bf3e0861d60 expected[15] = 0x4e4360593b6224b7 expected[16] = 0x764fba21d26da7e2 expected[17] = 0x4799e79c99c41ac7 expected[18] = 0x81d0af7c6b2613b4 expected[19] = 0x71274df9ca213b40 expected[20] = 0x6ba349848895752b expected[21] = 0xacce40e065cadc7a expected[22] = 0x68df30b65b850658 expected[23] = 0xf08dcaf63a96c712 expected[24] = 0x3015bcbbd307a0aa expected[25] = 0xb0d564fa30e3b4d5 expected[26] = 0xeb53d0b48a2ac18 expected[27] = 0x290b07cb7da9d12 expected[28] = 0xc41c86b53ce5886c expected[29] = 0xda740a01a04f1782 expected[30] = 0xaf7b5574fed2d5a0 expected[31] = 0x6e1f58ac94050c54

You do not need to reproduce these values to solve the challenge, but it helps confirm what is being compared.

How to solve it

Key observation

The binary never prints which byte was wrong. But in the check code it computes a 0/1 value and immediately adds it into r11:

40111a: sete al 401120: movzx r10,al 401127: add r11,r10

At the moment execution reaches address 0x401127 (the add r11,r10 instruction), register r10 is already set to:

1 if rbx == expected[r15]
0 otherwise

Also, at that same moment, $r15 is the current outer-loop index i.

So if we run the program under gdb and break exactly at 0x401127 with a condition if $r15 == i, then reading r10 tells us whether our current guess byte at index i is correct.

Why we can brute-force byte-by-byte

Even though rbx depends on all previous bytes, the program is deterministic. That means:

if you already fixed bytes 0..i-1 correctly
then for byte i you only need to search the remaining byte value so that the check at index i passes

Because the check at each index is immediately verifiable with the breakpoint trick above, you can build the flag sequentially from left to right.

Brute-force strategy

Fix bytes 0..i-1 to the values we already found.
For each candidate character c:
- construct an input buffer of length 32 where byte i = c and all other bytes are arbitrary filler
- run the binary under gdb (the breakpoint triggers and the gdb script immediately exits)
- break at 0x401127 only when $r15 == i
- read r10
The correct byte is the one where r10 == 1.
Repeat for i = 0..31.

The provided automated solver

The repository includes challenges/reven_WhosCounting/solve.py.

It implements exactly the strategy above, with two practical improvements:

It auto-locates the breakpoint address.
- In AT&T syntax, the check instruction is printed as add %r10,%r11.
- The script uses objdump -d and searches for that instruction to get the address.
It uses gdb in batch mode for speed and determinism.

For each candidate byte it writes input.txt and runs gdb with a script like:

set pagination off set confirm off file ./counting break *0x401127 if $r15 == 8 run < input.txt printf "R10=%lx\n", $r10 quit

It then parses the printed R10=... line and keeps the byte when it equals 1.

By default, the script brute-forces all 32 bytes starting from index 0.

If you already know a prefix of the flag (for example, many CTFs start with MetaCTF{), you can still use that knowledge to speed things up. The brute-force method works left-to-right because once bytes 0..i-1 are correct, byte i becomes independent to determine via the r10 success bit at 0x401127. So you can either:

start solving from byte index k instead of 0 (leave bytes 0..k-1 fixed to your known prefix), or
modify the script to pre-seed result with the known prefix bytes and then continue brute-forcing from k.

solve.py

#!/usr/bin/env python3import argparseimport osimport reimport stringimport subprocessimport sysfrom typing import List, OptionalRE_BREAK = re.compile(r"^\s*([0-9a-fA-F]+):.*add\s+%r10,%r11\b")def find_add_one_break_addr(binary_path: str) -> int: """ Locate the instruction that does `add %r10, %r11` in the disassembly. That is the spot where `r10` is 0/1 depending on whether the current byte check passes. """ out = subprocess.check_output(["objdump", "-d", binary_path], text=True, stderr=subprocess.DEVNULL) for line in out.splitlines(): m = RE_BREAK.match(line) if not m: continue return int(m.group(1), 16) raise RuntimeError("Could not locate `add %r10,%r11` in objdump output")def run_gdb_check( binary_path: str, break_addr: int, byte_index: int, candidate_byte: int, prefix_bytes: List[int], total_len: int, filler_byte: int,) -> Optional[int]: """ Returns 0/1 value from r10 when the conditional breakpoint triggers for $r15 == byte_index. If the breakpoint didn't trigger / parsing fails, returns None. """ buf = bytearray([filler_byte] * total_len) buf[:byte_index] = bytes(prefix_bytes) buf[byte_index] = candidate_byte # Binary reads exactly total_len bytes from stdin, so appending a newline is harmless. input_payload = bytes(buf) + b"\n" with open("input.txt", "wb") as f: f.write(input_payload) gdb_script = f"""set pagination offset confirm offfile {binary_path}break *0x{break_addr:x} if $r15 == {byte_index}run < input.txtprintf "R10=%lx\\n", $r10quit""" with open("gdb.txt", "w", encoding="utf-8") as f: f.write(gdb_script) res = subprocess.run( ["gdb", "-q", "--batch", "-x", "gdb.txt"], stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True, ) for line in res.stdout.splitlines(): if "R10=" in line: # Example: `R10=0` or `R10=1` try: return int(line.split("R10=", 1)[1].strip(), 16) except ValueError: return None return Nonedef brute_force_flag( binary_path: str, total_len: int, charset: bytes, filler_byte: int,) -> bytes: break_addr = find_add_one_break_addr(binary_path) print(f"[+] Using breakpoint address: 0x{break_addr:x}") result: List[int] = [] for i in range(total_len): print(f"[+] Solving byte {i}...") prefix = result[:] # bytes for indices [0, i) for c in charset: v = run_gdb_check( binary_path=binary_path, break_addr=break_addr, byte_index=i, candidate_byte=c, prefix_bytes=prefix, total_len=total_len, filler_byte=filler_byte, ) if v == 1: result.append(c) print(f" -> Byte {i}: {chr(c)!r}") break else: raise RuntimeError(f"Failed to find a matching byte at index {i}") return bytes(result)def main() -> None: ap = argparse.ArgumentParser(description="Solve `reven_WhosCounting` using gdb breakpoint brute-force.") ap.add_argument("--len", dest="total_len", type=int, default=32, help="Number of bytes to read (default: 32)") ap.add_argument( "--charset", default=None, help="Charset to brute force (default: printable ASCII 0x20..0x7e).", ) ap.add_argument("--filler", default="A", help="Fill for unknown bytes beyond the current index (default: 'A').") args = ap.parse_args() # Run from the challenge directory so that gdb can read input.txt/gdb.txt. here = os.path.dirname(os.path.abspath(__file__)) os.chdir(here) binary_path = os.path.join(here, "counting") if not os.path.exists(binary_path): raise FileNotFoundError(f"Binary not found: {binary_path}") if args.charset is None: charset = bytes(range(0x20, 0x7F)) # printable ASCII else: charset = args.charset.encode("latin-1", errors="strict") if len(args.filler) != 1: raise ValueError("--filler must be exactly one character") filler_byte = ord(args.filler) flag = brute_force_flag( binary_path=binary_path, total_len=args.total_len, charset=charset, filler_byte=filler_byte, ) print("[+] Flag:", flag.decode("ascii"))if __name__ == "__main__": main()

Interested in joining our team? Let’s connect!

Let's go