
How Ruthenium Handles Chunks

Ruthenium is our Fabric fork. Its job here is simple: keep nearby activity grouped together, let separate areas run at the same time, and avoid two threads touching the same space in conflicting ways.

TL;DR

Ruthenium uses region-based parallelism. The world is partitioned into ownership domains, each active domain gets enough surrounding space to operate safely, and a ticking region is not allowed to expand while it is already running. If two regions become too close, they merge. If a region becomes sparse enough, it can be recalculated and split. This region model is similar to the one used by Folia.

What a region actually is

A region is the scheduler's ownership unit for world activity. It is a set of chunk space plus region-local state, and the most important rule is exclusivity: a live chunk holder belongs to one and only one live region. That gives the system a single authority over each active area.

Ruthenium also does not maintain this purely on a raw chunk level (that would be too slow). It groups chunks into larger internal region sections first, then builds regions from those sections. That reduces maintenance cost while still tracking locality well enough for scheduling decisions.

A region also owns region-local state: things like its own tick progress, the entities and chunks it is currently responsible for, and other per-region bookkeeping. That state is only supposed to be touched by the thread currently ticking that region.

The invariants that make this safe

The regionizer exists to preserve a few hard guarantees so multiple areas can run in parallel without any races.

  • Every live chunk holder maps to exactly one region.
  • A region owns enough nearby section space to satisfy its merge buffer.
  • A ticking region cannot claim new outward territory mid-tick.
  • Regions that end up adjacent cannot keep ticking side by side forever; they eventually merge.
  • A region that becomes disconnected should eventually be split back into independent pieces.
  • Region state is explicit: ready, ticking, transient, or dead.

The second and third guarantees are the core of the design. They are what let a ticking region touch nearby chunk data without discovering that another thread has claimed the same boundary space at the same time.
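As a rough illustration, the exclusivity invariant and the explicit region states can be sketched like this. All names here are hypothetical; Ruthenium's actual identifiers will differ:

```python
from enum import Enum, auto

class RegionState(Enum):
    # Hypothetical state names mirroring the list above.
    READY = auto()      # eligible to be scheduled
    TICKING = auto()    # currently owned by a tick runner
    TRANSIENT = auto()  # marked for merge into a ticking neighbor
    DEAD = auto()       # released; owns nothing

def check_exclusive_ownership(regions):
    """Verify that every live chunk holder maps to exactly one live region."""
    seen = {}
    for region_id, chunks in regions.items():
        for chunk in chunks:
            if chunk in seen:
                raise AssertionError(
                    f"chunk {chunk} owned by both {seen[chunk]} and {region_id}")
            seen[chunk] = region_id
    return seen

# Two disjoint regions pass the check; overlapping ones would raise.
ownership = check_exclusive_ownership({
    "A": {(0, 0), (0, 1)},
    "B": {(10, 10)},
})
```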

Why the merge radius exists

If two active areas are close enough that either one could load or interact with the other's nearby chunk space, they are not truly independent: they will slow each other down or conflict over the same chunk data. We solve that by maintaining a merge radius around owned region sections.

In other words, a region owns not only the section space it is actively using, but also enough surrounding space to prevent another region from forming directly on its edge. That buffer is what makes synchronous nearby chunk work safe.

Another way to say it: a region should never start ticking with a directly adjacent active neighbor. If two active areas get close enough that they can no longer be treated as isolated, the system stops pretending they are independent and merges them.
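The merge-radius test amounts to a distance check between owned sections. A minimal Python sketch, assuming square sections and an illustrative radius of 2 (not the real value):

```python
MERGE_RADIUS = 2  # in region sections; illustrative value, not Ruthenium's

def chebyshev(a, b):
    """Chessboard distance between two section coordinates."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

def must_merge(sections_a, sections_b):
    """Two regions must merge if any of their owned sections fall
    inside each other's buffer."""
    return any(chebyshev(a, b) <= MERGE_RADIUS
               for a in sections_a for b in sections_b)

# Far apart: still independent. Within the buffer: forced merge.
far = must_merge({(0, 0)}, {(10, 10)})
near = must_merge({(0, 0)}, {(2, 1)})
```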

What happens while a region is ticking

Once a region enters the ticking state, its owned area is frozen for that tick. That rule prevents two active regions from racing toward the same unowned boundary and claiming it concurrently.

If new work appears that would require an expansion or merge, we record that relationship, but do not allow the currently ticking region to just grow outward on the spot. Ownership changes happen at controlled handoff points, not in the middle of arbitrary work.

Transient regions and deferred merges

When a merge is needed and one side is already ticking, we will defer the full merge. The non-ticking side can become transient relative to the ticking side, and the actual merge is completed after the ticking region finishes.

This is a key part of the design. It preserves ownership correctness without violating the no-expansion rule for active regions. The scheduler can acknowledge that a merge is now required without the race that performing it immediately would introduce.
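The deferred-merge flow can be sketched as a small state machine. The names and structure here are hypothetical, not Ruthenium's actual code:

```python
class Region:
    def __init__(self, name):
        self.name = name
        self.state = "ready"
        self.merge_into = None   # set while transient

def perform_merge(src, dst):
    # Absorbing chunks and offsetting tick counters is elided here.
    src.state = "dead"

def request_merge(src, dst, pending):
    """Record a needed merge; defer it if the target is mid-tick."""
    if dst.state == "ticking":
        # No-expansion rule: the ticking region must not grow right now.
        src.state = "transient"
        src.merge_into = dst
        pending.append((src, dst))
    else:
        perform_merge(src, dst)

def finish_tick(region, pending):
    """At the controlled handoff point, complete any deferred merges."""
    region.state = "ready"
    for src, dst in [p for p in pending if p[1] is region]:
        perform_merge(src, dst)
        pending.remove((src, dst))

a, b = Region("A"), Region("B")
b.state = "ticking"
pending = []
request_merge(a, b, pending)   # deferred: A becomes transient
finish_tick(b, pending)        # merge completes after B's tick ends
```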

How splitting works

Regions are not permanent partitions. As players move away and chunk activity dies down, a large region can become mostly empty or disconnected. We track dead or empty region sections and use recalculation thresholds to decide when it is worth checking whether the region can be split.

If enough dead space accumulates, the region is recalculated for independent subgraphs and can be broken back into smaller regions. This keeps the scheduler from staying over-merged after the workload has dispersed.

When that happens, the child regions inherit the parent's local tick counters so scheduled work does not suddenly shift in relative timing just because the shape of the region changed.
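The split check is essentially a connected-components pass over the still-alive sections. A rough Python sketch, assuming 8-connected section adjacency (an assumption, not a confirmed detail of the real implementation):

```python
def connected_components(sections):
    """Group alive sections into 8-connected components; each
    component is a candidate child region."""
    remaining = set(sections)
    components = []
    while remaining:
        stack = [remaining.pop()]
        comp = set()
        while stack:
            x, z = stack.pop()
            comp.add((x, z))
            for dx in (-1, 0, 1):
                for dz in (-1, 0, 1):
                    n = (x + dx, z + dz)
                    if n in remaining:
                        remaining.remove(n)
                        stack.append(n)
        components.append(comp)
    return components

# A region whose alive sections form two islands splits into two
# children, each inheriting the parent's local tick counter.
alive = {(0, 0), (0, 1), (20, 20)}
parts = connected_components(alive)
parent_tick = 1234
children = [{"sections": c, "current_tick": parent_tick} for c in parts]
```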

Where the scheduler fits in

Regionizing decides what can safely run in parallel. The scheduler decides which runnable region goes to which tick runner and when. Those are separate problems. You can have a good region model and still leave performance on the table if your scheduler is weak.

In simple terms, the scheduler operates on an EDF-style model. Runnable region tasks are ordered by deadline, and the soonest one gets priority. On top of that, it tries to preserve locality, reduce idle time before ticks, and rebalance overdue work when one runner falls behind.

The important detail is that regions tick independently. One region missing its 20 TPS target does not automatically drag another region off schedule, because each region tracks its own next start time instead of waiting on a single shared world tick loop.

How the scheduler works

EDF means Earliest Deadline First. Every runnable region task has a tick deadline, and the task with the earliest deadline gets priority. When a runner finishes its current work, it looks for the next runnable region with the soonest deadline and takes that task.

Left completely alone, this kind of scheduler is simple and predictable, but it can also bounce regions between runners too freely. That is good for hitting deadlines, but not always ideal for cache warmth or per-region continuity.

Internally, the scheduler uses ordered queues protected by a scheduler lock. Queue operations such as insert, poll, and requeue go through that lock so selection stays atomic and race-free. Ordering is deadline-first, with a unique task id used as a tie-breaker.
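That queue discipline can be sketched with a heap ordered by (deadline, task id) behind a single lock. This is a simplified model, not the actual implementation:

```python
import heapq
import itertools
import threading

class RegionScheduler:
    """Deadline-ordered task queue; one lock keeps insert and poll atomic."""
    def __init__(self):
        self._lock = threading.Lock()
        self._heap = []
        self._ids = itertools.count()   # unique id breaks deadline ties

    def insert(self, deadline, region):
        with self._lock:
            heapq.heappush(self._heap, (deadline, next(self._ids), region))

    def poll(self):
        """Take the runnable region with the soonest deadline, if any."""
        with self._lock:
            return heapq.heappop(self._heap)[2] if self._heap else None

s = RegionScheduler()
s.insert(150, "spawn")
s.insert(100, "nether-hub")
s.insert(100, "farm")   # same deadline: the earlier-inserted id wins
order = [s.poll(), s.poll(), s.poll()]
```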

The result is still deadline-driven first, but not naive. The scheduler tries to keep urgency, locality, and runner utilization in balance instead of optimizing for only one of those things.

Folia describes this as an earliest-start-time-first schedule that ends up behaving like EDF for ticking, because each start time implies a deadline 50ms later. The goal is simple: if a region can usually finish inside that 50ms budget and the pool is not saturated, it should still hold 20 TPS.
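The cadence arithmetic is simple enough to show directly. A sketch of one plausible per-region scheduling rule; the overrun policy here is an assumption:

```python
TICK_MS = 50  # 20 TPS budget: one region tick every 50 ms

def schedule_next(prev_start_ms, finish_ms):
    """A region's next start is 50 ms after its previous start, unless the
    tick overran, in which case it starts as soon as possible. Only this
    region's schedule shifts; other regions are unaffected."""
    return max(prev_start_ms + TICK_MS, finish_ms)

# On-budget tick: the 20 TPS cadence is preserved.
on_time = schedule_next(1000, 1030)
# Overrun tick: this region falls behind by itself.
late = schedule_next(1000, 1080)
```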

  • Mid-tick work lets a runner execute queued region tasks while waiting for the main tick deadline.
  • Work stealing lets overdue work migrate when one runner falls behind badly enough.

Tick counters and the global region

Region threading also changes what a "tick counter" means. In vanilla there is effectively one current tick and one world game time. In a regionized model, those counters are split so local scheduling stays local.

Current tick and redstone time are maintained per region, while global game time and daylight time live in a separate global region. At the start of each region tick, the region takes a snapshot of the global values so code running mid-tick sees a stable view instead of time changing underneath it.

That global region is also the owner for state that is not naturally tied to one local area: weather, game rules, world border behavior, console commands, and similar world-global tasks. Unlike normal regions, it is never split or merged.

Merges need a little bookkeeping here. If one region is merged into another, absolute deadlines tied to local current tick or redstone tick have to be offset so their relative timing stays the same after the merge.
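That offset is plain arithmetic. A sketch, assuming deadlines are stored as absolute local ticks:

```python
def remap_deadlines(tasks, from_tick, into_tick):
    """When region A merges into region B, A's absolute tick deadlines are
    shifted by the difference in local tick counters, so each task stays
    the same number of ticks away."""
    offset = into_tick - from_tick
    return [(deadline + offset, task) for deadline, task in tasks]

# A task due 10 ticks from now in A is still due 10 ticks from now in B.
a_tasks = [(105, "grow-crops")]   # region A's current tick is 95
merged = remap_deadlines(a_tasks, from_tick=95, into_tick=400)
```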

Mid-tick work and buffered deadlines

One of the more interesting parts of the scheduler is that a runner does not always sit idle while waiting for the formal start of a region tick. If there is safe queued work available, it can run that work until a buffered deadline says it is time to stop and prepare for the real tick.

The idea is to shrink backlog without letting that side work spill into the main region tick budget. The scheduler uses the time before the tick productively, but not recklessly.
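A minimal model of that loop, with simulated timestamps and an illustrative 5 ms safety buffer (the real buffer and task shapes are not specified here):

```python
import collections

def run_mid_tick(queue, now_ms, tick_start_ms, buffer_ms=5):
    """Drain queued side work until a buffered deadline just before the
    real tick. Tasks that would spill past the buffer stay queued."""
    done = []
    stop_at = tick_start_ms - buffer_ms
    while queue and now_ms < stop_at:
        task, cost_ms = queue.popleft()
        if now_ms + cost_ms > stop_at:
            queue.appendleft((task, cost_ms))  # would eat into the tick
            break
        now_ms += cost_ms
        done.append(task)
    return done, now_ms

q = collections.deque([("chunk-load", 10), ("chunk-gen", 30), ("light", 20)])
done, t = run_mid_tick(q, now_ms=1000, tick_start_ms=1050)
```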

Work stealing

Work stealing does not mean random task migration. The scheduler still tries to keep regions on the same runner when that is working. Stealing is the fallback when a runner is late enough that catching up matters more than keeping that region local.

With stealing, each runner effectively has its own local queue, plus access to a global scheduling view. Selection prefers the best local or global deadline candidate first, then checks whether another runner's head task has become stealable. That gives the scheduler a way to rebalance overload without constantly destroying locality.
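The selection order can be sketched as follows. The queue shapes, threshold, and victim choice here are illustrative, not Ruthenium's actual policy:

```python
def pick_task(local, others, now_ms, steal_after_ms=100):
    """Prefer the local head task; otherwise steal another runner's head,
    but only once it is overdue by more than a threshold, so stealing
    rebalances overload instead of constantly destroying locality."""
    if local:
        return local.pop(0)
    stealable = [q for q in others
                 if q and now_ms - q[0][0] > steal_after_ms]
    if stealable:
        # Take the most overdue head so worst-case lateness is bounded.
        victim = min(stealable, key=lambda q: q[0][0])
        return victim.pop(0)
    return None

# Each queue holds (deadline_ms, region) pairs sorted by deadline.
runner_a = []                                # idle runner
runner_b = [(900, "spawn"), (1100, "farm")]  # "spawn" is 200 ms overdue
stolen = pick_task(runner_a, [runner_b], now_ms=1100)
```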

Cross-region work

The easy case is work that stays inside one region. The harder case is anything that needs to touch another region that may already be ticking, may be transient, or may not even exist yet. The safe way to do that is not direct access, but queued handoff.

That same issue is what makes more complicated actions like teleportation thread-safe. A simple teleport is really two stages: remove and transform on the source region, then asynchronously place on the destination region. Portal travel adds a search-or-create stage in the middle because the final exit location is not known up front.
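The two-stage handoff can be modeled with per-region task queues. This is a toy model: real regions tick concurrently and the real queues are thread-safe, neither of which is shown here:

```python
import collections

# Per-region task queues stand in for cross-region scheduling.
queues = collections.defaultdict(collections.deque)

def teleport(entity, src, dst, world):
    """Stage 1 runs on the source region; stage 2 is queued to the
    destination region and runs when that region next drains its queue."""
    def stage1():
        world[src].remove(entity)     # remove + transform on the source
        queues[dst].append(stage2)    # queued handoff, never direct access
    def stage2():
        world[dst].add(entity)        # place on the destination
    queues[src].append(stage1)

def drain(region):
    """Stand-in for a region tick running its queued work."""
    while queues[region]:
        queues[region].popleft()()

world = {"overworld": {"steve"}, "nether": set()}
teleport("steve", "overworld", "nether", world)
drain("overworld")   # source ticks: entity removed, stage 2 queued
drain("nether")      # destination ticks: entity placed
```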

Shutdown and unfinished teleports

Shutdown has to respect that some operations may still be in flight. Teleports are the clearest example. If the server stops while an entity is between regions, the shutdown path has to finish that handoff in a controlled way before world and player data are saved.

That is why the order must differ from vanilla. The region scheduler is stopped first, chunk systems are halted before saving, and pending teleports are resolved before player data is written. If a teleport is in progress, the safest move is to finish placing the entity at the known destination. If a portal teleport is still missing its final exit position, the fallback is to put the entity back at the source side rather than guess against a half-shutdown world.

A couple of vanilla quirks

Some odd mechanics matter enough to preserve even in a region-threaded world. One example is the detached fishing-hook behavior used in certain wireless-style contraptions. When a player changes dimension, the hook can stay alive in its own area, remember who owns it, and wait until the right region is available again before finishing the interaction.

In practice, that means the hook does not instantly break just because its owner or target is being handled elsewhere. The server keeps a thread-safe snapshot of the state it needs, avoids touching off-thread entities directly, and then schedules the final reel-in or cleanup on the correct region thread once that work is safe to realize.

Another example is redirectable projectiles, such as when an arrow hits something that can be deflected. In a single-threaded server that redirection can read the owner's look direction directly. In a region-threaded server, that same idea still works by keeping a thread-safe copy of the needed rotation data, so the deflection can be computed correctly even if the owner is not on the same active region thread at that exact moment.
