Advent of Code 2025
Advent of Code is an annual event where people solve Santa-themed programming problems in December.
Here, I’ll write about my solutions for the 2025 event. It’s shorter than past ones, at just 12 days. The twist in my version is that I’ll try to squeeze the pointless microseconds off my solutions’ running times.
In Julia, to stay high-level, yet have compiled speeds.
In the grand tradition of these writeups, I will not include either the question, or my input. You are invited to try solving on your own first.
Day 1
I want to measure my solution’s speed, not the
String→Int64 performance, so let’s start by
importing the measurument tools, and parsing the input. I’ll use
positive and negative ints for R and L
rotations respectively.
using BenchmarkTools
scan_val(s) = (s[1] == 'L' ? -1 : 1) * parse(Int64, s[2:end])
lines = readlines(stdin) .|> scan_valMy first solution to part 1 is what you’d expect:
using IterTools
positions(xs) = accumulate(+, chain([ 50 ], xs))
fits(x) = x % 100 == 0
part1_v1(r) = r |> positions .|> fits |> sumTo measure the run time, I use BenchmarkTools.jl:
display(@benchmark part1_v1($lines))BenchmarkTools.Trial: 10000 samples with 7 evaluations per sample.
Range (min … max): 3.974 μs … 293.885 μs ┊ GC (min … max): 0.00% … 95.64%
Time (median): 4.490 μs ┊ GC (median): 0.00%
Time (mean ± σ): 5.316 μs ± 5.461 μs ┊ GC (mean ± σ): 10.55% ± 11.21%
██▆▃▂▁ ▁ ▂
████████▇▆▆▆▅▅██▆▆▅▄▅▃▁▃▁▁▁▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▅██▇▆▆▅▅ █
3.97 μs Histogram: log(frequency) by time 30.2 μs <
Memory estimate: 33.45 KiB, allocs estimate: 6.
Note the allocations, and the 4 µs time. We are creating intermediate arrays. This could be converted to iterators, or we can use a plain loop.
function part1_v2(rs)
cur, p1 = 50, 0
for r ∈ rs
cur += r
p1 += fits(cur)
end
return p1
endDid it improve things?
BenchmarkTools.Trial: 10000 samples with 9 evaluations per sample.
Range (min … max): 2.862 μs … 4.351 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 2.878 μs ┊ GC (median): 0.00%
Time (mean ± σ): 2.890 μs ± 62.957 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▁▆█▇▄ ▁ ▁ ▂
█████▇▃▁▁▁▁▃▁▁▁▁▁▁▁▁▃▁▁▃▇███▇▇▇▄▄▅▃▄▄▄▄▄▃▁▄▃▁▃▄▃▁▄▁▁▃▁▄▆██ █
2.86 μs Histogram: log(frequency) by time 3.19 μs <
Memory estimate: 0 bytes, allocs estimate: 0.
Indeed.
I don’t think much else can be done for part 1: we do need to look at each prefix sum, and do a check for each point. If there was a guarantee that each rotation is less than a full circle, we could convers the modulo division to a comparison and subtraction, but there is no such guarantee, and it’s not true for my input.
For part 2, let’s start with something simple:
moves(ps) = zip(ps[1:end-1], ps[2:end]) .|> sort
zeros_between_v1((x, y)) = x:y .|> fits |> sum
part2_v1(r) = (r |> positions |> moves .|> zeros_between_v1 |> sum) - part1_v1(r) + fits(sum(r))zeros_between_v1 checks every number in a closed
interval [x, y]. So
every “stop” is counted twice; to counter that, we subtract the result
of part 1. But the very last point was only counted once in both, so we
need to add it back in, if it fits. Off-by-one is cursed.
Benchmark!
BenchmarkTools.Trial: 9013 samples with 1 evaluation per sample.
Range (min … max): 481.182 μs … 3.512 ms ┊ GC (min … max): 0.00% … 77.50%
Time (median): 495.684 μs ┊ GC (median): 0.00%
Time (mean ± σ): 553.220 μs ± 221.447 μs ┊ GC (mean ± σ): 8.20% ± 13.44%
█▆▅▄▃▂ ▁
████████▇▇▇▆▅▅▁▄▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▅▆▆▆▇▇▇▇██▇▆▇▇▆▇▆▆▆ █
481 μs Histogram: log(frequency) by time 1.54 ms <
Memory estimate: 618.52 KiB, allocs estimate: 4242.
I love a high baseline.
Checking every number is silly; we can do integer division to decide how many circles we’ve done.
zeros_between((x, y)) = fld(y, 100) - fld(x, 100) + fits(x)
part2_v2(r) = (r |> positions |> moves .|> zeros_between |> sum) - part1_v2(r) + fits(sum(r))Note how we need to check the left edge separately; consider [200, 250]: fld would return 2
for both edges.
Did that help?
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 32.967 μs … 1.131 ms ┊ GC (min … max): 0.00% … 92.84%
Time (median): 36.134 μs ┊ GC (median): 0.00%
Time (mean ± σ): 44.375 μs ± 49.280 μs ┊ GC (mean ± σ): 14.41% ± 11.87%
█▅▂▁ ▁
█████▆▆▁▄▄▇█▆▄▃▃▄▃▁▁▃▁▁▁▁▄▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▁▁▁▁▁▁▁▄▅▆██ █
33 μs Histogram: log(frequency) by time 351 μs <
Memory estimate: 321.41 KiB, allocs estimate: 34.
That’s better. How far can we go?
Let’s again try and control allocations by going iterative.
function part2_v3(rs)
p2 = p1 = 0
step, last, next = 0, 50, 50
for step ∈ rs
next = last + step
p2 += step>0 ? zeros_between((last, next)) : zeros_between((next, last))
p1 += fits(last)
last = next
end
p1 += fits(last)
return p2 - p1
endOur zeros_between expects a sorted pair. Note that I’ve
inlined the part 1 calculation.
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 14.754 μs … 65.051 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 14.835 μs ┊ GC (median): 0.00%
Time (mean ± σ): 14.905 μs ± 679.352 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▂▆█▇▄ ▂
█████▅▁▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▅▆▇▇▆▆▆▅▆▅▅▅▇█▇▅▄▄ █
14.8 μs Histogram: log(frequency) by time 16.5 μs <
Memory estimate: 0 bytes, allocs estimate: 0.
More than 2x speedup.
Can we go further?
I’m so glad you asked. Euclidean division is expensive; the integer
division computers do (“closest to zero”) is faster, but will only work
for us if all positions are positive. Let’s make them so. Also, cache
results of integer division as much as possible: each edge is
effectively processed twice, as we run zeros_between twice
for each end. Let’s inline it.
function part2_v4(rs)
p2 = 0
step, last, next = 0, 10000050, 10000050 # "ensure" no negative numbers so div == fld
divlast = divnext = 10000050 ÷ 100
fitslast = fitsnext = 0
for step ∈ rs
next = last + step
divnext = next ÷ 100
fitsnext = Int64(next == 100 * divnext)
# let's group both counts to avoid extra assignments
if signbit(step)
p2 += divlast - divnext + fitsnext - fitslast
else
p2 += divnext - divlast
end
last, divlast, fitslast = next, divnext, fitsnext
end
return p2
endI’ve also inlined “part 1” calculation further.
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 8.180 μs … 83.509 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 8.239 μs ┊ GC (median): 0.00%
Time (mean ± σ): 8.288 μs ± 825.242 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▄▇█▆▁ ▂
█████▃▁▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅██▅▁▁▁▁▁▁▁▁▁▁▁▄▅▅▆▆▇▅▅▃▅ █
8.18 μs Histogram: log(frequency) by time 9.69 μs <
Memory estimate: 0 bytes, allocs estimate: 0.
For the last step, let’s chunk together same-direction rotations.
function part2_v5(rs)
p2 = p1 = 0
step, last = rs[1], 1000050
divlast = 10000
fitslast = 0
for r ∈ Iterators.rest(rs, 2) # rs[2:end] would allocate!
if signbit(r) == signbit(step) # step and r never zero, so this is equivalent to sign()
step += r
continue
end
# don’t want to keep extra vars, will update in place instead. extra assignment to p2, but -3 assignments to *next.
if signbit(step)
p2 += divlast - fitslast
last += step
divlast = last ÷ 100
fitslast = Int64(last == 100 * divlast)
p2 += -divlast + fitslast
else
p2 -= divlast
last += step
divlast = last ÷ 100
fitslast = Int64(last == 100 * divlast)
p2 += divlast
end
step = r
end
if signbit(step)
p2 += divlast - fitslast
last += step
divlast = last ÷ 100
fitslast = Int64(last == 100 * divlast)
p2 += -divlast + fitslast
else
p2 -= divlast
last += step
divlast = last ÷ 100
fitslast = Int64(last == 100 * divlast)
p2 += divlast
end
return p2
endGiving
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
Range (min … max): 5.198 μs … 22.137 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 5.238 μs ┊ GC (median): 0.00%
Time (mean ± σ): 5.278 μs ± 378.826 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
▃▇█▇▆▄▄▄▂ ▂
██████████▇▇█▆▆▇▅▄▅▅▅▅▄▄▄▄▁▃▅▃▃▄▃▃▁▁▁▄▁▁▁▁▁▁▁▁▃▁▁▁▃▁▁▄▇███▆ █
5.2 μs Histogram: log(frequency) by time 5.99 μs <
Memory estimate: 0 bytes, allocs estimate: 0.
Day 2
Let’s start with data preparation:
inp = "<puzzle input>"
stringed_intervals = split(inp, ",") .|> (x->split(x, "-")) .|> x->(x[1], x[2])
numeric_intervals = stringed_intervals .|> x->(x .|> x->parse(Int, x))The problem is at an intersection, where stringy representations are useful (we’re dealing with properties of the decimal representation), but numeric ones also can be (it is a numeric property).
Let’s start with a “regex” approach:
re1 = r"^(.+)\1$"
re2 = r"^(.+)\1+$"
match_in_interval(re, (x, y)) = x:y .|> string .|> (s->occursin(re, s) ? parse(Int, s) : 0) |> sum
match_in_interval(re) = (xs...) -> match_in_interval(re, xs...)
p1 = numeric_intervals .|> match_in_interval(re1) |> sum
p2 = numeric_intervals .|> match_in_interval(re2) |> sum
display((p1, p2))Note the quotes around “regex”: this is a form PCRE supports, but these are not regular expressions, and cannot be compiled to FSMs for efficient execution. We are also checking every number in the sequence.
Benchmarks:
BenchmarkTools.Trial: 24 samples with 1 evaluation per sample.
Range (min … max): 204.602 ms … 222.144 ms ┊ GC (min … max): 5.23% … 3.80%
Time (median): 210.692 ms ┊ GC (median): 5.25%
Time (mean ± σ): 210.891 ms ± 4.028 ms ┊ GC (mean ± σ): 5.71% ± 1.33%
Memory estimate: 133.89 MiB, allocs estimate: 3899771.
BenchmarkTools.Trial: 24 samples with 1 evaluation per sample.
Range (min … max): 207.480 ms … 237.776 ms ┊ GC (min … max): 4.29% … 6.68%
Time (median): 215.786 ms ┊ GC (median): 6.20%
Time (mean ± σ): 215.849 ms ± 5.896 ms ┊ GC (mean ± σ): 5.61% ± 1.35%
Memory estimate: 133.89 MiB, allocs estimate: 3899771.
About as bad as you’d expect.
Let’s do some more preprocessing on our source data:
using DataStructures
function preprocessed_numeric_intervals(n_i)
out = Stack{Tuple{Int64, Int64}}()
for (x, y) ∈ n_i
digits_x, digits_y = trunc(Int64, log10(x)), trunc(Int64, log10(y))
while digits_x < digits_y
push!(out, (x, 10^(digits_x+1)-1))
x = 10^(digits_x+1)
digits_x += 1
end
push!(out, (x, y))
end
return out
end
prepr_int = numeric_intervals |> preprocessed_numeric_intervalsThis gives us a stack of intervals where start and end are equally long. This makes things easier. The timing here is sub-µs.
Now for part 1. If we have an interval of, say, 8-digit numbers, all those that “fit” will be divisible by 10 001. So we can just look at an arithmetic progression, and at the part of it within our interval.
function part1_v1(intervals)
tot = 0
for (x, y) ∈ intervals
digits = trunc(Int64, log10(x)) + 1
if digits % 2 == 1
continue
end
divisor = 10^(digits ÷ 2) + 1
min_fits = x ÷ divisor + 1
max_fits = y ÷ divisor
if max_fits >= min_fits
tot += divisor * trunc(Int64, (min_fits + max_fits) / 2 * (max_fits - min_fits + 1))
end
end
return tot
endBenchmark:
BenchmarkTools.Trial: 10000 samples with 88 evaluations per sample.
Range (min … max): 810.341 ns … 1.328 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 812.273 ns ┊ GC (median): 0.00%
Time (mean ± σ): 814.401 ns ± 9.249 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
Yeah.
For part 2, we’ll need to generalize this to take the length of the repeated substring:
function failed_for_length(substr_length, (x, y))
if substr_length == 0 return 0 end
digits = trunc(Int64, log10(x)) + 1
if digits % substr_length != 0 return 0 end
divisor = sum(10^i for i ∈ 0:substr_length:(digits-substr_length))
min_fits, max_fits = x ÷ divisor+1, y ÷ divisor
if max_fits < min_fits return 0 end
return divisor * (min_fits + max_fits) * (max_fits - min_fits + 1) ÷ 2
endReimplementing part 1 with this
function part1_v2(intervals)
tot = 0
for (x, y) ∈ intervals
digits = trunc(Int64, log10(x)) + 1
tot += failed_for_length(digits ÷ 2, (x, y))
end
return tot
endcauses a slowdown (largely due to extra integer division), but we’ve only done this to check.
So for part 2 we want to check all possible substring length, right?
Except: if we are looking at 8-digit numbers, number 12121212 will be
included both in failed_for_length(2,...), and
failed_for_length(4,...). So we have to be smart. Full
proof is left as an exercise to the reader, but:
If you have a divisor di of total length D, and divisor dj, and di divides dj, and C(x) is all ids formed by repeating substring of length x, then C(di) ∈ C(dj).
So we need to:
- Find all di < D such that di|D
- for each di, calculate Si = ∑x ∈ C(di)x
- for each di, find Ni = |j : di|dj ∧ i ≠ j|
- result is ∑iSi(1 − Ni)
I fell that we can just pre-compute all di and 1 − Ni for the lengths we have in our input. We have lengths up to 10.
function part2_v2(intervals)
# pre-computed divisors and coefficients.
divs = [
[ ],
[ (1, 1) ],
[ (1, 1) ],
[ (2, 1) ],
[ (1, 1) ],
[ (1, -1), (2, 1), (3, 1), ],
[ (1, 1) ],
[ (4, 1) ],
[ (3, 1) ],
[ (1, -1), (2, 1), (5, 1), ],
]
tot = 0
for (x, y) ∈ intervals
D = trunc(Int64, log10(x)) + 1
tot += sum(
(n * failed_for_length(d, (x, y))
for (d, n) ∈ divs[D]
if d != 0);
init=0)
end
return tot
endGiving a benchmark of
BenchmarkTools.Trial: 10000 samples with 6 evaluations per sample.
Range (min … max): 5.359 μs … 1.844 ms ┊ GC (min … max): 0.00% … 99.19%
Time (median): 6.152 μs ┊ GC (median): 0.00%
Time (mean ± σ): 6.841 μs ± 21.484 μs ┊ GC (mean ± σ): 4.29% ± 1.40%
Memory estimate: 6.23 KiB, allocs estimate: 230.
We can get some more speedup out of this. Note, for example, that we
are computing digits both in part2_v2 and in
failed_for_length. We can just pass in what was calculated,
and cut out a log10. Exponentiation may also be
problematic, so let’s pre-compute powers of 10. Finally, let’s make the
divs array static, so we don’t allocate.
using StaticArrays
function failed_for_length_v2(substr_length, digits, (x, y))
powers_10 = SA[ 1, 10, 100, 1000, 10_000, 100_000, 1_000_000, 10_000_000, 100_000_000, 1_000_000_000, 10_000_000_000 ]
divisor = sum(view(powers_10, 1:substr_length:(1+digits-substr_length)))
min_fits, max_fits = x ÷ divisor+1, y ÷ divisor
if max_fits < min_fits return 0 end
return divisor * (min_fits + max_fits) * (max_fits - min_fits + 1) ÷ 2
end
function part2_v4(intervals)
# pre-computed divisors and coefficients.
divs = SA[
SA[ (0,0), (0,0), (0,0), ],
SA[ (1, 1), (0,0), (0,0), ],
SA[ (1, 1), (0,0), (0,0), ],
SA[ (2, 1), (0,0), (0,0), ],
SA[ (1, 1), (0,0), (0,0), ],
SA[ (1, -1), (2, 1), (3, 1), ],
SA[ (1, 1), (0,0), (0,0), ],
SA[ (4, 1), (0,0), (0,0), ],
SA[ (3, 1), (0,0), (0,0), ],
SA[ (1, -1), (2, 1), (5, 1), ],
]
tot = 0
for (x, y) ∈ intervals
D = trunc(Int64, log10(x)) + 1
tot += sum(
(n * failed_for_length_v2(d, D, (x, y))
for (d, n) ∈ divs[D]
if d != 0);
init=0)
end
return tot
endBenchmark:
BenchmarkTools.Trial: 10000 samples with 10 evaluations per sample.
Range (min … max): 1.788 μs … 4.093 μs ┊ GC (min … max): 0.00% … 0.00%
Time (median): 2.035 μs ┊ GC (median): 0.00%
Time (mean ± σ): 2.046 μs ± 108.202 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
Memory estimate: 0 bytes, allocs estimate: 0.
And that’s all that comes to mind.