Advent of Code 2025
Advent of Code is an annual event where people solve Santa-themed programming problems in December.
Here, I’ll write about my solutions for the 2025 event. It’s shorter than past ones, at just 12 days. The twist in my version is that I’ll try to squeeze the pointless microseconds off my solutions’ running times.
In Julia, to stay high-level, yet have compiled speeds.
In the grand tradition of these writeups, I will not include either the question, or my input. You are invited to try solving on your own first.
Day 1
I want to measure my solution’s speed, not the
String→Int64 performance, so let’s start by
importing the measurument tools, and parsing the input. I’ll use
positive and negative ints for R and L
rotations respectively.
using BenchmarkTools
scan_val(s) = (s[1] == 'L' ? -1 : 1) * parse(Int64, s[2:end])
lines = readlines(stdin) .|> scan_valMy first solution to part 1 is what you’d expect:
using IterTools
positions(xs) = accumulate(+, chain([ 50 ], xs))
fits(x) = x % 100 == 0
part1_v1(r) = r |> positions .|> fits |> sumTo measure the run time, I use BenchmarkTools.jl:
display(@benchmark part1_v1($lines))BenchmarkTools.Trial: 10000 samples with 7 evaluations per sample.
Range (min … max): 3.974 μs … 293.885 μs ┊ GC (min … max): 0.00% … 95.64%
Time (median): 4.490 μs ┊ GC (median): 0.00%
Time (mean ± σ): 5.316 μs ± 5.461 μs ┊ GC (mean ± σ): 10.55% ± 11.21%
Memory estimate: 33.45 KiB, allocs estimate: 6.
Note the allocations, and the 4 µs time. We are creating intermediate arrays. This could be converted to iterators, or we can use a plain loop.
function part1_v2(rs)
cur, p1 = 50, 0
for r ∈ rs
cur += r
p1 += fits(cur)
end
return p1
endDid it improve things?
BenchmarkTools.Trial: 10000 samples with 9 evaluations per sample.
Range (min … max): 2.862 μs … 4.351 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 2.878 μs ┊ GC (median): 0.00%
Time (mean ± σ): 2.890 μs ± 62.957 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
Memory estimate: 0 bytes, allocs estimate: 0.
Indeed.
I don’t think much else can be done for part 1: we do need to look at each prefix sum, and do a check for each point. If there was a guarantee that each rotation is less than a full circle, we could convers the modulo division to a comparison and subtraction, but there is no such guarantee, and it’s not true for my input.
For part 2, let’s start with something simple:
moves(ps) = zip(ps[1:end-1], ps[2:end]) .|> sort
zeros_between_v1((x, y)) = x:y .|> fits |> sum
part2_v1(r) = (r |> positions |> moves .|> zeros_between_v1 |> sum) - part1_v1(r) + fits(sum(r))zeros_between_v1 checks every number in a closed
interval [x, y]. So
every “stop” is counted twice; to counter that, we subtract the result
of part 1. But the very last point was only counted once in both, so we
need to add it back in, if it fits. Off-by-one is cursed.
Benchmark!
BenchmarkTools.Trial: 9013 samples with 1 evaluation per sample.
Range (min … max): 481.182 μs … 3.512 ms ┊ GC (min … max): 0.00% … 77.50%
Time (median): 495.684 μs ┊ GC (median): 0.00%
Time (mean ± σ): 553.220 μs ± 221.447 μs ┊ GC (mean ± σ): 8.20% ± 13.44%
Memory estimate: 618.52 KiB, allocs estimate: 4242.
I love a high baseline.
Checking every number is silly; we can do integer division to decide how many circles we’ve done.
zeros_between((x, y)) = fld(y, 100) - fld(x, 100) + fits(x)
part2_v2(r) = (r |> positions |> moves .|> zeros_between |> sum) - part1_v2(r) + fits(sum(r))Note how we need to check the left edge separately; consider [200, 250]: fld would return 2
for both edges.
Did that help?
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 32.967 μs … 1.131 ms ┊ GC (min … max): 0.00% … 92.84%
Time (median): 36.134 μs ┊ GC (median): 0.00%
Time (mean ± σ): 44.375 μs ± 49.280 μs ┊ GC (mean ± σ): 14.41% ± 11.87%
Memory estimate: 321.41 KiB, allocs estimate: 34.
That’s better. How far can we go?
Let’s again try and control allocations by going iterative.
function part2_v3(rs)
p2 = p1 = 0
step, last, next = 0, 50, 50
for step ∈ rs
next = last + step
p2 += step>0 ? zeros_between((last, next)) : zeros_between((next, last))
p1 += fits(last)
last = next
end
p1 += fits(last)
return p2 - p1
endOur zeros_between expects a sorted pair. Note that I’ve
inlined the part 1 calculation.
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 14.754 μs … 65.051 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 14.835 μs ┊ GC (median): 0.00%
Time (mean ± σ): 14.905 μs ± 679.352 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
Memory estimate: 0 bytes, allocs estimate: 0.
More than 2x speedup.
Can we go further?
I’m so glad you asked. Euclidean division is expensive; the integer
division computers do (“closest to zero”) is faster, but will only work
for us if all positions are positive. Let’s make them so. Also, cache
results of integer division as much as possible: each edge is
effectively processed twice, as we run zeros_between twice
for each end. Let’s inline it.
function part2_v4(rs)
p2 = 0
step, last, next = 0, 10000050, 10000050 # "ensure" no negative numbers so div == fld
divlast = divnext = 10000050 ÷ 100
fitslast = fitsnext = 0
for step ∈ rs
next = last + step
divnext = next ÷ 100
fitsnext = Int64(next == 100 * divnext)
# let's group both counts to avoid extra assignments
if signbit(step)
p2 += divlast - divnext + fitsnext - fitslast
else
p2 += divnext - divlast
end
last, divlast, fitslast = next, divnext, fitsnext
end
return p2
endI’ve also inlined “part 1” calculation further.
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 8.180 μs … 83.509 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 8.239 μs ┊ GC (median): 0.00%
Time (mean ± σ): 8.288 μs ± 825.242 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
Memory estimate: 0 bytes, allocs estimate: 0.
For the last step, let’s chunk together same-direction rotations.
function part2_v5(rs)
p2 = p1 = 0
step, last = rs[1], 1000050
divlast = 10000
fitslast = 0
for r ∈ Iterators.rest(rs, 2) # rs[2:end] would allocate!
if signbit(r) == signbit(step) # step and r never zero, so this is equivalent to sign()
step += r
continue
end
# don’t want to keep extra vars, will update in place instead. extra assignment to p2, but -3 assignments to *next.
if signbit(step)
p2 += divlast - fitslast
last += step
divlast = last ÷ 100
fitslast = Int64(last == 100 * divlast)
p2 += -divlast + fitslast
else
p2 -= divlast
last += step
divlast = last ÷ 100
fitslast = Int64(last == 100 * divlast)
p2 += divlast
end
step = r
end
if signbit(step)
p2 += divlast - fitslast
last += step
divlast = last ÷ 100
fitslast = Int64(last == 100 * divlast)
p2 += -divlast + fitslast
else
p2 -= divlast
last += step
divlast = last ÷ 100
fitslast = Int64(last == 100 * divlast)
p2 += divlast
end
return p2
endGiving
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 5.198 μs … 22.137 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 5.238 μs ┊ GC (median): 0.00%
Time (mean ± σ): 5.278 μs ± 378.826 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
Memory estimate: 0 bytes, allocs estimate: 0.
Not bad.