Test doubles are commonplace in any decent automated test suite, though I've found that the value you get from your tests, and the cost of reworking them, depends heavily on how you use doubles.
A common approach is to lean heavily on mocking frameworks to scaffold these, defining ad-hoc behaviour (usually per test) against some kind of interfaced dependency of the subject under test. This might be a class with several interfaces injected into the constructor, or it might be a 'full' service spanning the API ingress all the way to a mocked-out messaging or data access layer.
I moved from primarily mocking to faking when setting up doubles, and now I advocate for it. Let me be clear about what I mean by this:
Mocking: Programmatically configuring ad-hoc interactions for dependencies, typically ignoring input and returning fixed output for ease of configuration. Interaction-oriented (e.g. 'I have called this function exactly once with these parameters').
Faking: Building a specific drop-in replacement for the dependency which facilitates testing, reusable across tests, as powerful or weak as you make it. State/behaviour oriented.
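To make the distinction concrete, here is a minimal sketch using a hypothetical `CustomerRepo` dependency (not from the article) to show the difference in style:

```typescript
type CustomerRepo = { getAll: () => Promise<string[]> }

// Mock style: ad-hoc canned output, configured per test, input ignored
const mockRepo: CustomerRepo = { getAll: async () => ['alice'] }

// Fake style: a reusable in-memory replacement with real (if simplified)
// behaviour; tests seed and read plain state instead of configuring interactions
const customers: string[] = []
const fakeRepo: CustomerRepo = { getAll: async () => [...customers] }

customers.push('alice') // seeding state - no mocking DSL involved
```

The mock returns the same list no matter what the system did; the fake returns whatever state the system (or the test) actually put there.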
Examples of fakes
We could fake:
- A repository object that provides access patterns to a dog table in an animal database, allowing storage, retrieval and statistical queries over in-memory data
- A feature flag service that receives input like flag id and tenant id, and returns values from a simple in-memory database
- A message queueing library that allows you to publish events and commands
Anything you can mock you can fake, though I mostly recommend faking at the boundaries of your application - all the examples above represent points where data is sent to or retrieved from outside application code. I would not, for example, fake other modules to test a single module in isolation from all the rest, when those modules are just broken-down and organised parts of a bigger 'whole'.
So, do I mock internals instead? Simply put, no (unless there's truly a good reason to). I found this hard to grasp a while ago, but I no longer find it valuable to have every single part of the system tested in complete isolation - testing real code is much more effective. Fake out the boundaries, wire the internals together, and if you start finding tests painful to write, you might be falling foul of the actual production code design - mocking things to make them easier to test can hide that complexity and increase risk.
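As a small sketch of that shape (all names here are illustrative, not from the article): the internal module is exercised for real, and only the outer edge is faked.

```typescript
type Message = { type: string; payload?: unknown }
type PublishEvent = (event: Message) => Promise<void>

// Real internal code, wired up as-is in the test
const makeRegisterDog = (publish: PublishEvent) => async (name: string) => {
  if (!name.trim()) throw new Error('name required')
  // ...more real business logic would live here...
  await publish({ type: 'DogRegistered', payload: { name } })
}

// Faked only at the boundary
const Events: Message[] = []
const fakePublish: PublishEvent = async (event) => { Events.push(event) }

// The test drives real internals and asserts on boundary state
const registerDog = makeRegisterDog(fakePublish)
registerDog('Rex').then(() => {
  if (Events[0]?.type !== 'DogRegistered') throw new Error('expected event')
})
```

If `makeRegisterDog`'s wiring gets painful to set up like this, that is useful feedback about the production design rather than a testing problem.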
What does a fake look like?
Let's take one example from above - 'A message queueing library that allows you to publish events and commands' - where the real implementation is a TypeScript module with two exported functions:
// messageBus.ts
// (the Message type and the real SendCommand/PublishEvent function
// implementations live alongside these exported types - omitted here)
export type SendCommand = (command: Message) => Promise<void>
export type PublishEvent = (event: Message) => Promise<void>
We could build a stateful module with:
- public arrays of all received messages/commands
- some type-safe adherence to the real module
- easy ability to reset between tests
- ability to simulate communication errors
// fakeMessageBus.ts
import type * as MessageBus from '@lib/messageBus'
import type { Message } from '@lib/messageBus' // assuming Message is exported alongside the functions

let fault: Error | undefined

export const Commands: Message[] = []
export const Events: Message[] = []

export const ResetFake = () => {
  Commands.length = 0
  Events.length = 0
  fault = undefined
}

export const SetFault = (error: Error) => {
  fault = error
}

const fake = {
  SendCommand: (command: Message) => {
    if (fault) {
      return Promise.reject(fault)
    }
    Commands.push(command)
    return Promise.resolve()
  },
  PublishEvent: (event: Message) => {
    if (fault) {
      return Promise.reject(fault)
    }
    Events.push(event)
    return Promise.resolve()
  }
} satisfies typeof MessageBus // Enforces a full module type check - not necessary, but a nice way to catch API changes

// These exports make this module a full drop-in replacement
export const SendCommand = fake.SendCommand
export const PublishEvent = fake.PublishEvent
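A test against this fake then reads like plain code: seed, act, assert on state. The sketch below inlines the fake so it is self-contained (in a real suite you would import from fakeMessageBus.ts and substitute it for '@lib/messageBus' via your test runner's module aliasing); the `Message` shape is assumed for illustration.

```typescript
type Message = { type: string; payload?: unknown } // illustrative shape

let fault: Error | undefined
const Commands: Message[] = []

const ResetFake = () => { Commands.length = 0; fault = undefined }
const SetFault = (error: Error) => { fault = error }
const SendCommand = (command: Message): Promise<void> => {
  if (fault) return Promise.reject(fault)
  Commands.push(command)
  return Promise.resolve()
}

async function run() {
  // Happy path: assert on state, not on interaction details
  ResetFake()
  await SendCommand({ type: 'RegisterDog', payload: { name: 'Rex' } })
  if (Commands.length !== 1) throw new Error('expected one recorded command')

  // Failure path: simulate a communication error
  SetFault(new Error('broker unavailable'))
  const rejected = await SendCommand({ type: 'RegisterDog' }).then(() => false, () => true)
  if (!rejected) throw new Error('expected SendCommand to reject')
}
run()
```

Note there is no mocking DSL anywhere: the assertions are ordinary reads of `Commands`, and the fault simulation is one function call.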
And if you were faking something other than a module, like a class, the pattern is generally the same with some small differences. A class may not need an explicit reset function, as you can simply re-instantiate it per test and get some isolation natively.
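For instance, a class-shaped fake for the feature flag service mentioned earlier might look like this (the interface and names are assumed for illustration); each test news one up, so isolation comes for free:

```typescript
interface FeatureFlagClient {
  getFlag(flagId: string, tenantId: string): Promise<boolean>
}

class FakeFeatureFlagClient implements FeatureFlagClient {
  // Public, seedable state instead of per-method mock configuration
  readonly flags = new Map<string, boolean>()

  seed(flagId: string, tenantId: string, value: boolean): void {
    this.flags.set(`${flagId}:${tenantId}`, value)
  }

  async getFlag(flagId: string, tenantId: string): Promise<boolean> {
    return this.flags.get(`${flagId}:${tenantId}`) ?? false
  }
}
```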
Pros and cons of faking
From my own experience, I've found these positives:
- It's easier and more elegant to express stateful testing. You're not using the DSL of a mocking API to seed and retrieve state; you're often using plain old references to public variables on the fake.
- Fakes that behave in an equivalent (but often simplified) manner to the real thing help behavioural tests - no bespoke or special setup is needed.
- Changes to the fake that reflect real behaviour changes can instantly highlight problems across the test suite, reducing maintenance overhead.
- With a fake already created, there is little-to-no overhead for those unfamiliar with the real module to start using it. Tests will prove whether they're using it correctly; they don't need to recreate the API interactions.
- With reuse, there is less opportunity to make a mistake when setting up a test - it's harder to mock a method's behaviour wrongly when you don't have to mock it each time.
- Where you cannot test against the real thing outside of production, fakes can act as decent mini-simulators if you build them from detailed analysis of the real dependency. This can pay off heavily in e.g. cloud/serverless-native software, where you cannot run anything meaningful locally without a lot of work and expense.
- They are much more resilient to application refactors:
- If you already have the fake wired up and you change client code to start calling it, no additional mocking is required to facilitate that - only assertions on the new outcomes.
- Changes to the API of the real thing can be less impactful across the test suite when you modify the fake - a single source of truth, etc.
And this is not a panacea - there are negatives and other things to consider:
- It can be more upfront work to get a first passing test - mocking out a method to return a list of all customers is easy to add in your first test, probably in a single expression, but an effective fake will likely need a means for you to seed it, and then an implementation that reads from the pre-seeded data.
- There are absolutely no guarantees that the fake will behave in a way that makes your tests valuable (though this is true of all test doubles, which is why I favour mocking/stubbing/faking as little as is reasonable).
- It can mean more maintenance, as changes to the fake affect all tests, whereas people typically don't centralise mocking in any way that would have this effect. This is perhaps more a pattern of usage than a downside of either approach - nothing stops you building new fakes for different purposes, but you lose some of the advantages by doing that and accrue more of the negatives.
- If you change the implementation of the real thing in a way that affects behaviour or state, you also have to reflect that in the fake, or the deviation may lead to a false sense of security across the codebase. This can also be true of mocking, and in my experience it is a much bigger problem to remediate there.
- It can look like a lot of odd work, and less sophisticated than using a mocking framework, so some teams may view it as a questionable departure from standard or 'best' practice - when both options are really just a means to an end.
What's in a good fake?
To me, a good fake has these traits:
- Is not overly complex and does not model significant business behaviour
- Provides useful additional functionality to test code (exposed data, helper methods to set up specific scenarios/seed data)
- Is a drop-in replacement that doesn't require convoluted test setup to intercept/redirect real function calls
- Hits the right balance of a simplified experience
- A fake messaging library does not need to start invoking other parts of the system and dequeueing; so long as you can validate the messages, you can do contract testing here and use the real thing for end-to-end/integration testing.
- A fake CRUD repository works much better as a simple in-memory store/cache than as something that expects you to set up the data for each response.
- Could feasibly be used in the running software to provide a simulated experience
- Once built, is cheap and preferable to reuse - people shouldn't be reaching for the mocking library.
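The dog-repository example from earlier illustrates that balance - an in-memory store with simple query support (the shape here is assumed for illustration), rather than canned per-call responses:

```typescript
type Dog = { id: string; name: string; weightKg: number }

class FakeDogRepository {
  private readonly dogs = new Map<string, Dog>()

  async save(dog: Dog): Promise<void> {
    this.dogs.set(dog.id, dog)
  }

  async get(id: string): Promise<Dog | undefined> {
    return this.dogs.get(id)
  }

  // A simple statistical query computed over the in-memory data,
  // not a pre-configured answer
  async averageWeightKg(): Promise<number> {
    const all = [...this.dogs.values()]
    if (all.length === 0) return 0
    return all.reduce((sum, d) => sum + d.weightKg, 0) / all.length
  }
}
```

Because the query is computed from whatever was saved, the same fake serves every test without reconfiguration.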
I think a good fake also lives close to the real deal - if you are building libraries or SDKs and you can distribute fakes in lockstep with the real code, you are providing a powerful tool to consumers of your library. This is no easy task and requires discipline, but I've found it works very well with internal company utilities. A distributed fake isn't guaranteed to be perfect for your needs, but it's test code - take a copy and edit it if you must, build your own, or don't use it if it's more bother than it's worth!
Can't I just achieve a lot of this with my mocking setup?
Yes, actually. I've spoken a lot about the typical use of mocking, where things are set up ad-hoc and repeated, and people validate interaction details over stateful/behavioural outcomes. I acknowledge that a lot of the problems with mocking lie in how it is applied, but I don't think that invalidates the other benefits of faking - and it's worth considering how easy mocking frameworks make it to go wrong like this. With good practice and discipline, you can centralise your mocking config, facilitate some stateful assertions, and add builder methods to scaffold well-defined behaviours for a function (like setupMockToRejectTheFirstCall). I've done this in the past when using a BDD framework to build reusable Given/When/Then steps to set up functionality.
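A builder like that doesn't even need a framework - here is one possible sketch of the setupMockToRejectTheFirstCall idea mentioned above (the implementation is my own assumption, not from the article):

```typescript
// Returns an async function that rejects on its first call and resolves with
// the given result thereafter - the reusable behaviour lives in one place
function setupMockToRejectTheFirstCall<T>(result: T, error: Error): () => Promise<T> {
  let calls = 0
  return () => {
    calls += 1
    return calls === 1 ? Promise.reject(error) : Promise.resolve(result)
  }
}
```

Usage: `const getCustomers = setupMockToRejectTheFirstCall(['alice'], new Error('transient'))` gives you a retry-scenario stub any test can reuse.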
But I don't think any of that is simpler, and it requires a dependency on a mocking framework and an understanding of that framework's API. Fakes are usually plain old code: much simpler, with functionality and state colocated by nature (it takes effort to scatter them around), and it's more hassle than it's worth to set up the kind of interaction-testing patterns that I think don't hold value (insofar as 'I have called function X with parameters Y, Z exactly N times').
Do fakes deliver value?
Ultimately, we don't code to write the most correct or best option at all times - we get paid to deliver value. In my experience, fakes help me deliver more value, faster and with more confidence: they are simpler to understand, reusable in the right way, facilitate the testing patterns I favour for the confidence they give me (state, behaviour), are more resistant to refactoring, and are easier to onboard people with. That's why I prefer them. I don't think the investment in building them is particularly big, and I do think it pays off particularly well. I wire them up, they get out of my way, I write my tests.