The momentum of building a product, shaping abstract ideas into something concrete, comes with compromises and limitations. We quickly connect components and take shortcuts to speed up deliveries. This sets the scene for unpredictability.
Even a prototype is composed of several interacting components with growing complexity. That complexity leads to emergent behaviour:
A system-level property that arises from the interaction of simpler components and is not defined by any single component.
Emergent behaviour can be neutral or positive. Here, I’m specifically interested in emergent failure modes – undefined or hard-to-enforce invariants, scalability issues, and unexpected runtime errors.
Architectural pressure is the force that cracks the design in a specific place, leading to failure modes. Recognising such pressure gives us the drive to evolve the architecture.
It signals that requirements have outgrown the current design and highlights the necessary next steps. We discover these cracks through incidents or by deliberately pushing on areas that seem sensitive.
This essay uses the initial design of a web crawler to demonstrate emergent failure modes as a concrete phenomenon, and how recognising architectural pressure can significantly expand capabilities.
Part 1. Emergent Behaviour
Meet the Crawler
I need a crawler to search for signals through changes in specific web pages. For example, new trends in software engineering.
The first version relied on an Orchestrator to drive the entire flow: it initiates crawling from a seed URL, fetches the web page, processes the page URLs, and enqueues only supported, normalised, unseen URLs.

Pages are processed concurrently. Each crawled page is processed sequentially as described in the diagram above.
The process function implements the pipeline above.
Orchestrator#process (simplified Scala code):

```scala
private def process(
    queue: Queue[F, Uri],
    inflightWork: Ref[F, Long]
)(url: Uri): F[Unit] =
  // enqueue URL and track in-flight work
  def enqueue(url: Uri) =
    inflightWork.update(_ + 1) *> queue.offer(url)

  (for
    // fetch webpage
    htmlPage <- fetcher.fetch(url)
    // extract urls from the page
    pageLinks <- linkExtractor.extract(htmlPage)
    // emit all page urls to stdout
    _ <- emitter.emit(pageLinks)
    acceptedLinks = pageLinks.links.filter(urlFilter.accept)
    // canonicalise urls
    canonicalisedLinks = acceptedLinks.map(UrlNormaliser.canonicalise)
    // deduplicate urls
    deduplicatedLinks <- canonicalisedLinks.toVector.filterA(link =>
      deduplicator.hasSeen(link).map(!_)
    )
    // add all valid newly discovered urls to the queue
    _ <- deduplicatedLinks.traverse_(enqueue)
  yield ())
    .guarantee {
      // decrease in-flight work counter
      inflightWork.update(_ - 1)
    }
```
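The pipeline only enqueues supported, normalised, unseen URLs. As a rough illustration of what those three steps might involve, here is a standard-library sketch. The rules below (http/https only, lower-cased scheme and host, no fragment) are assumptions made for the example, not the behaviour of the crawler's actual urlFilter, UrlNormaliser or deduplicator, and the function names are hypothetical.

```scala
import java.net.URI
import scala.collection.mutable

// Assumption: only http(s) URLs that carry a host are supported.
def accept(uri: URI): Boolean =
  val scheme = Option(uri.getScheme).map(_.toLowerCase)
  scheme.exists(s => s == "http" || s == "https") && uri.getHost != null

// Assumption: canonical form = lower-cased scheme and host, resolved
// dot segments, "/" for an empty path, and no fragment.
def canonicalise(uri: URI): URI =
  val normalised = uri.normalize()
  val path = Option(normalised.getPath).filter(_.nonEmpty).getOrElse("/")
  new URI(
    normalised.getScheme.toLowerCase,
    null, // user info is dropped
    normalised.getHost.toLowerCase,
    normalised.getPort,
    path,
    normalised.getQuery,
    null // fragment is dropped
  )

// Same order as the pipeline: filter, canonicalise, deduplicate.
// Records accepted links in `seen` as a side effect.
// Assumes every link parses as a URI; URI.create throws otherwise.
def newLinks(links: Seq[String], seen: mutable.Set[URI]): Seq[URI] =
  links
    .map(link => URI.create(link))
    .filter(accept)
    .map(canonicalise)
    .filter(uri => seen.add(uri)) // `add` returns false for already-seen URLs
```

Canonicalising before deduplicating matters: two spellings of the same page only collapse into one entry once they share a canonical form.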
There are 3 fundamental components powering the crawler:
- A bounded URL queue preventing OOM
- Fibers supporting cooperative multi-tasking
- Orchestrator defining the data flow and key invariants such as termination.
Emergent Design Pressure
All components work as expected locally:
- The queue blocks operations when full
- Fibers enable concurrent page processing without blocking threads
- Orchestrator composes the dataflow and makes sure termination is correct.
Observe the failure mode:
- The Orchestrator discovers many URLs for each URL it processes
- The bounded queue reaches capacity
- Each concurrent Orchestrator task finishes processing its page
- Since the queue is full, in-flight tasks are suspended when enqueuing newly discovered URLs
- Since in-flight Orchestrator tasks are suspended, they cannot dequeue URLs
- Termination depends on completing all in-flight work
- Result: deadlock. The crawler is unable to make progress or terminate.
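The sequence above can be reproduced in miniature. The sketch below is not the crawler's code: it assumes a single worker, swaps the effectful queue for java.util.concurrent.ArrayBlockingQueue, and uses the non-blocking offer so that the example terminates instead of hanging. The name rejectedLinks and the example.com URLs are made up for illustration.

```scala
import java.util.concurrent.ArrayBlockingQueue

// Counts the discovered links that do NOT fit into the bounded queue
// after a single worker has processed the seed page.
def rejectedLinks(capacity: Int, fanOut: Int): Int =
  val queue = new ArrayBlockingQueue[String](capacity)
  queue.offer("https://example.com/seed")

  // The worker takes the seed URL off the queue...
  val page = queue.poll()

  // ...and tries to enqueue every link found on that page.
  // `offer` returns false when the queue is full; a blocking `put`
  // would suspend the worker at this point instead.
  val links = (1 to fanOut).map(i => s"$page/link-$i")
  links.count(link => !queue.offer(link))
```

With a capacity of 2 and a fan-out of 5, three links are rejected. With a blocking enqueue, the worker would be suspended on the third link, and since it is also the only consumer, nothing would ever drain the queue. Adding workers only delays the outcome: once all of them are suspended on enqueue, the same stall occurs.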
Components are locally sound. Their interaction leads to an undesirable global property: the application may stall when pages have a very high URL fan-out.
There is a missing invariant: the crawler must always be able to make progress regardless of the load.
The liveness issue is an accidental global property. It is an architectural pressure: a signal that the crawler design doesn’t meet requirements. It needs immediate attention.
The Source of Tension
That tension originates from the initial, deliberate coupling between control-flow and data-flow in the Orchestrator.
That coupling also makes it very difficult to implement politeness and fairness.
These limitations were accepted for a controlled prototype. The deadlock was not – I stumbled upon the liveness issue.
Part 2. Conscious Architectural Evolution (From Emergence to Intent)
Planning Next Steps Guided By Pressure
The liveness issue and the difficulty in implementing key features are direct consequences of the current design: the Orchestrator couples work admission, control and data processing in a single component.
These pressures point to the next architectural evolutionary steps:
- Convert the Orchestrator into a Worker, responsible only for processing URLs
- Introduce Ingress to admit URLs into the system
- Admission must never stall Workers (it may block on I/O)
- Control URL dispatch with a Scheduler.
Scheduler (algebra):

```scala
trait Scheduler[F[_]]:
  def dispatch: fs2.Stream[F, Uri]
```
Ingress (algebra):

```scala
trait Ingress[F[_], T]:
  def publish(event: T): F[Unit]
  def publish(events: Vector[T]): F[Unit]
  def stream: fs2.Stream[F, T]
```
The initial control policy is simple: the Scheduler dispatches URLs to workers at a steady rate.
The Ingress's sole responsibility is transport: it’s the entry point into the system and has well-defined semantics:
- Idempotently admit URLs into the system
- Expose a stream where each admitted URL is emitted exactly once.
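To make those two guarantees concrete, here is a deliberately simplified, synchronous sketch: no effect type and no fs2 stream, just a set and a queue from the standard library. InMemoryIngress and next are hypothetical names for this example. Deduplicating inside the ingress is an assumption of the sketch; an implementation could equally rely on callers to publish only unseen URLs.

```scala
import scala.collection.mutable

// Hypothetical, synchronous stand-in for an Ingress.
final class InMemoryIngress[T]:
  private val admitted = mutable.Set.empty[T]   // everything ever published
  private val pending  = mutable.Queue.empty[T] // admitted, not yet emitted

  // Idempotent admission: publishing the same event twice admits it once.
  def publish(event: T): Unit =
    if admitted.add(event) then pending.enqueue(event)

  def publish(events: Vector[T]): Unit =
    events.foreach(event => publish(event))

  // Emits the next admitted event in FIFO order, if any.
  // Each admitted event is returned exactly once.
  def next(): Option[T] =
    if pending.isEmpty then None else Some(pending.dequeue())
```

Publishing "a" and then Vector("a", "b") leaves two pending events, emitted in admission order as "a" then "b".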
The diagram below demonstrates the revised architecture:

FifoScheduler#dispatch:

```scala
class FifoScheduler[F[_]: Temporal] private (
    source: fs2.Stream[F, Uri], // from Ingress
    dispatchInterval: FiniteDuration // rate limiting work admission
) extends Scheduler[F]:
  def dispatch: fs2.Stream[F, Uri] =
    source.metered(dispatchInterval)
```
Ingress.makeQueuedIngress (in-memory):

```scala
object Ingress:
  def makeQueuedIngress[F[_]: Temporal, T](
      queue: Queue[F, T] // unbounded queue
  ): Ingress[F, T] =
    new Ingress[F, T]:
      override def publish(event: T): F[Unit] =
        queue.offer(event)
      override def publish(events: Vector[T]): F[Unit] =
        events.traverse_(queue.offer)
      override def stream: fs2.Stream[F, T] =
        fs2.Stream.fromQueueUnterminated(queue)
```
Worker#process:

```scala
// Unbounded `targets` stream replaces queue
class Worker[F[_]: Async](targets: fs2.Stream[F, Uri], . . .):
  def run = targets.parEvalMap(maxConcurrency)(process)

  private def process(url: Uri): F[Unit] =
    def publish(newlyDiscovered: Vector[Uri]) =
      tracker.track(newlyDiscovered.size) *> ingress.publish(newlyDiscovered)

    (for
      htmlPage <- fetcher.fetch(url)
      pageLinks <- linkExtractor.extract(htmlPage)
      _ <- emitter.emit(pageLinks)
      acceptedLinks = pageLinks.links.filter(urlFilter.accept)
      canonicalisedLinks = acceptedLinks.map(UrlNormaliser.canonicalise)
      deduplicatedLinks <- canonicalisedLinks.toVector.filterA(link =>
        deduplicator.hasSeen(link).map(!_)
      )
      // Publish new links to Ingress instead of offering to queue directly
      _ <- publish(deduplicatedLinks)
    yield ())
      .guarantee {
        tracker.completed()
      }
```
These changes fix the liveness issue and unlock new options. They do not provide fairness or politeness yet, but introduce a central point where these new policies can be defined.
They also introduce a new pressure.
Pressure Shifts
In the original design, the Orchestrator both produced and buffered URLs using a single bounded queue.
Now, Workers immediately publish discovered URLs. The Ingress emits a stream of admitted URLs, and the Scheduler dispatches them to Workers at a controlled rate. Because most webpages contain many links, Workers produce URLs far faster than they can process them.
The in-memory Ingress implementation relies on an unbounded queue to hold pending URLs. This removes the deadlock, but shifts the pressure elsewhere: the backlog of newly discovered URLs grows without bound, potentially leading to OOM.
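A back-of-the-envelope sketch of that growth, under a strong simplifying assumption: every processed page yields the same number of previously unseen URLs. Real fan-out varies, and deduplication reduces it as the crawl progresses, so treat the figure as an upper-bound intuition rather than a measurement. backlogAfter is a hypothetical helper.

```scala
// Pending URLs after `dispatched` pages have been processed, starting from
// a single seed URL. Each dispatch removes one URL from the backlog and
// adds `fanOut` newly discovered ones.
def backlogAfter(dispatched: Long, fanOut: Long): Long =
  1 + dispatched * (fanOut - 1)
```

For example, with an assumed fan-out of 20, the backlog holds 19,001 URLs after 1,000 dispatched pages. The Scheduler's dispatch rate limits how fast URLs leave the queue, not how fast they arrive.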
The redesign does not eliminate the risks created by the in-memory queue. It isolates the risks.
If memory-growth becomes a problem, the in-memory Ingress can be swapped for a persistent implementation.
PostgresIngress (durable):

```scala
object PostgresIngress:
  def make[F[_]](): Resource[F, Ingress[ConnectionIO, Uri]] =
    Resource.pure(new PostgresIngress())

private class PostgresIngress private () extends Ingress[ConnectionIO, Uri]:
  override def publish(url: Uri): ConnectionIO[Unit] =
    Statements.insertUrl(url)
  override def publish(urls: Vector[Uri]): ConnectionIO[Unit] =
    Statements.insertUrls(urls)
  override def stream: fs2.Stream[ConnectionIO, Uri] =
    fs2.Stream.repeatEval(Statements.dequeueUrl).unNone
```
Since the architecture is defined by contracts, the other parts of the system remain unchanged.
The in-memory Ingress still provides several benefits:
- Extremely high throughput
- No external dependencies
- Fully non-blocking operations.
Keeping both implementations provides options for different execution environments: simple, fast, and riskier vs. safer, slightly slower and operationally heavier.
The crawler evolution thus far:
Prototype
↓
Deadlock
↓
Admission (in-memory ingress)
↓
Control / URL processing separation
↓
Optional durable ingress
Part 3. Politeness & Fairness (A Gentle and Effective Crawler)
As it stands, the crawler can only be used under tight control, for example by restricting it to a single domain. Otherwise it risks behaving as a DoS tool.
To unlock its full potential – extracting meaningful signal from a diverse set of webpages – it must implement politeness and fairness.
Politeness requires an interval between requests to the same domain.
Fairness preserves throughput. While waiting for the cooldown period of one domain, the crawler can continue to make progress by fetching URLs from others.
With a clear separation between admission, control and processing, supporting the new control policies can be achieved by swapping the current FifoScheduler for a more involved Coordinator. The other parts of the crawler remain intact.
Enter the Coordinator
- The Coordinator uses a priority queue of DomainQueues to find the next eligible URL
- A DomainQueue is initialised during admission, when the first URL for a domain arrives
- The priority queue keeps the DomainQueue with the earliest next-eligible URL at the head, from which the next URL is taken
- After dispatching a URL, the Coordinator advances the domain’s eligibility time by the cooldown period.
Since each domain has its own queue, hot domains do not starve lower-traffic ones.

Internally, a TreeSet ordered by (nextEligibilityAt, domain) provides priority queue semantics – preferred over a binary heap since reordering requires efficient removal, which TreeSet can handle in O(log n).
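The mechanics can be sketched with the standard library alone. This is a minimal, synchronous approximation rather than the actual Coordinator: the class name is reused for readability, while admit, next, cooldownMs and the millisecond clock passed in by the caller are assumptions of this example.

```scala
import scala.collection.mutable

final class Coordinator(cooldownMs: Long):
  // One FIFO queue of pending URLs per domain.
  private val pending = mutable.Map.empty[String, mutable.Queue[String]]
  // Earliest time at which each known domain may be fetched again.
  private val eligibleAt = mutable.Map.empty[String, Long]
  // Domains that have pending URLs, ordered by (nextEligibilityAt, domain).
  private val ready = mutable.TreeSet.empty[(Long, String)]

  // Admission: create the domain queue on first sight, then append the URL.
  def admit(domain: String, url: String, now: Long): Unit =
    val queue = pending.getOrElseUpdate(domain, mutable.Queue.empty[String])
    if queue.isEmpty then
      // A drained domain keeps its previous eligibility time, so re-admission
      // cannot bypass the cooldown.
      ready += ((eligibleAt.getOrElseUpdate(domain, now), domain))
    queue.enqueue(url)

  // Dispatch: take a URL from the head domain if its cooldown has elapsed.
  def next(now: Long): Option[String] =
    ready.headOption.filter(_._1 <= now).map { case entry @ (_, domain) =>
      ready -= entry // O(log n) removal; the domain is re-inserted under a new key
      val queue = pending(domain)
      val url = queue.dequeue()
      val nextAt = now + cooldownMs
      eligibleAt(domain) = nextAt
      if queue.nonEmpty then ready += ((nextAt, domain))
      url
    }
```

With a 1,000 ms cooldown and two URLs admitted for a.com plus one for b.com at time 0, successive calls to next(0) return the first a.com URL and then the b.com URL. The second a.com URL only becomes available once the clock reaches 1,000. That is both policies at work: a.com is never hit twice within the cooldown, and b.com does not wait behind it.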
The Coordinator – the new Scheduler implementation – is more involved than the previous snippets. Both the source and the spec are available on GitHub.
The pressure that initially signalled a broken design has, in the end, helped shape a better one.
With a few remaining features in place, the crawler will reach a point where I can shift the focus from building a crawler to extracting signal from the vast web.
Reflections
In practice, the most careful and competent engineer will be surprised and think “How haven’t I seen this before?” at some point. And that is the work.
Intelligence, creativity and sophistication often compound the problem, particularly at the system level. Early adoption of technologies, elaborate interactions between components, and abstractions lead to unjustified complexity when not firmly grounded in actual necessity. This often translates into fragility. Boring and battle-tested is good.
A simple heuristic is to start simple and address emergent failure modes as they appear, identifying and isolating the architectural pressure as early as possible, using it to inform deliberate architectural evolution.
Missing invariants may be inevitable. The architectural pressure will surface them. Let it guide the next evolutionary step.

