Variant Experiment Server User Guide



Version 0.8, October 2017.

1 Variant Server Overview

1.1 New in Version 0.8

1.1.1 Schema-Scoped Event Flushers

Event flushers are now defined in the experiment schema, using the /meta/flusher/ property. This allows for finer control of the final destination for Variant events. This is particularly relevant in combination with multiple schemata support, also new in this release. Configuration may still be used to define a server-wide default for those schemata which do not define their own event flusher.

1.1.2 Multiple Schemata Support

You may now deploy multiple experiment schemata on an instance of Variant Experiment Server. Multiple schemata are convenient in the following use cases:

  • Support for multiple application domains:
    If your application environment consists of multiple domains, you likely want to separate them into independently managed units. For instance, if you manage a B2B SaaS application with multiple customers, you may want to factor out each customer’s experiments into a separate experiment schema. However, this doesn’t mean that you must run a separate instance of Variant server for each of them: a single Variant instance can handle a practically unlimited number of experiment schemata.

  • Separation between experiment and toggle schemata:
    The experiments defined for the purpose of feature toggling are likely to use an event flusher different from that of regular experiments, because toggling experiments are not analyzed the same way regular experiments are. Since all experiments in a schema share the same flusher, toggling experiments should be separated into their own schema(ta).

  • Support for continuous integration (CI) environments:
    In a modern concurrent development environment, your experiment schema will exist in multiple revisions on different code branches. All of your integration builds can deploy their experiment schemata to a single instance of Variant server, where they coexist independently from each other. All experiments can be tested in the framework of your continuous integration practices, removing the need for manual pre-production experiment verification.

1.1.3 Revised Configuration Properties

See Variant Experiment Server Reference for a complete list of supported configuration properties.

1.2 Online Controlled Experiments Overview

Variant Experiment Server enables software developers to conduct complex, full-stack, feature-scoped online controlled experiments on interactive, human-facing computer applications. The need for such experiments arises whenever the business impact of an update to the application must be measured with respect to some business metric, such as sales.

A typical example is an eCommerce Web application: a change to the check-out experience will likely have a direct impact on sales. (While a change to an FAQ page will not.) Variant server enables application developers to validate a proposed update to user experience by running it in parallel with the existing experience, as a randomized controlled experiment, where the new code path is the treatment and the existing experience — the control.

In a Variant experiment, user traffic is split randomly (but not necessarily equally) between the two or more experiences, and performance data is collected for comparing their relative effectiveness with respect to the business metric(s) of interest. The experiment is run for as long as it takes for the measurements to reach statistical significance — a mathematical term meaning that enough traffic has passed through the experience that the observed difference is not likely due to chance alone.
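To make the random-but-not-necessarily-equal split concrete, here is a minimal sketch (not Variant’s actual implementation) of targeting a session to one of several weighted experiences; the experience names and the 3:1 split are illustrative:

```javascript
// Sketch only: pick an experience with probability proportional to its weight.
function targetSession(experiences) {
  const total = experiences.reduce((sum, e) => sum + e.weight, 0);
  let r = Math.random() * total;
  for (const e of experiences) {
    r -= e.weight;
    if (r < 0) return e.name;
  }
  return experiences[experiences.length - 1].name; // guard against FP rounding
}

// Illustrative split: roughly 3 of every 4 sessions see the control experience.
const split = [
  { name: "control", weight: 3 },
  { name: "treatment", weight: 1 }
];
```

Over a large number of sessions, the observed proportions converge to the weight ratios, which is what allows the measured difference between experiences to be tested for statistical significance.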

1.3 Variant’s Key Features

1.3.1 Client-Server Architecture

Variant server provides experiment instrumentation service to a host application, which accesses it via a Variant client suitable for the particular programming environment. This architecture is particularly attractive to modern distributed applications, which consist of multiple service components: each service that wishes to participate in an experiment communicates with the experiment server via a client library.

As of release 0.8.0, Variant supports the following client environments:

  • Java: A fully functional Variant client with complete support for all Variant server functionality. Any component of the host application written in Java or another JVM language can integrate with this client. Several higher-level adapters are also available.
  • JavaScript: A partial Variant client that supports triggering of remote events from a Web browser environment.

Variant server is installed on the same network as the host application(s), either on premises or in the cloud, ensuring low network latency and proximity to operational data. Both are critical to enterprise applications: Variant’s runtime overhead is under 10 milliseconds per state request cycle, and Variant is able to use operational data for real-time targeting and qualification.

1.3.2 Experiment Schema

Fundamental to Variant’s functionality is its Experiment Definition Model (XDM), allowing the experiment designer to define an experiment as an abstract idea, without any knowledge of the type of the application under test. The only assumption that XDM makes about the host application is that it is interactive, i.e. responds to and waits for external input. This input may come in a variety of ways: from a human via a GUI or an IVR, or from a program via an API, etc.

XDM operates on abstract concepts, like states and tests, which are high enough level to factor out the need to know how the interactivity is delivered, but low enough to exhaustively define a set of experiments. To instrument a new code path as an experiment, the application developer uses XDM to create a human-readable experiment schema file, which Variant server then uses to route user traffic through the experiment. A single instance of Variant server supports a practically unlimited number of experiment schemata, allowing multiple independent application domains to run against a single Variant Experiment Server.

An experiment schema abstracts out of the application code all the logic required to instrument it for an experiment — the key idea, called declarative instrumentation. The experiment schema enables clean separation of experiment instrumentation from experience implementation: the application developer uses familiar development tools to implement the new experience(s), unconcerned with how these new experiences will be instrumented as an experiment. Consequently, application developers spend very little time instrumenting (and breaking down) experiments, making experimentation inexpensive and lowering the risk of instrumentation-related bugs.

1.3.3 Distributed Session Management

Variant server acts as the centralized session repository, accessible to any Variant client by the session ID. Variant maintains its own session, rather than relying on the host application’s, because it is frequently desirable for a Variant experiment session to survive the recreation of the host application’s session.

Variant’s session is distributed in the sense that Variant guarantees a consistent view of it across all clients that may be accessing it. In a distributed environment, multiple components of the host application may be handling a user action and hence working with the same session. Any change to the Variant session’s state is seamlessly reflected in all the clients accessing it.

1.3.4 Concurrent Experiments

A typical user-facing enterprise application will run dozens of experiments every month. Many of these experiments will be instrumented on the same key sections of user experience, making them concurrent. Variant provides state-of-the-art support for concurrent experiments, enabling different isolation models. With Variant, you will never have to delay a feature launch because it must wait for another experiment to conclude.

1.3.5 Feature Toggles

Feature toggles leverage Variant’s powerful instrumentation mechanism for post-deployment testing. Whenever you roll out a new product feature, toggles enable you to first publish it to a limited population of users, while sending all others into the existing experience. If all goes well, you gradually increase traffic into the new code path until full production. But if a defect is discovered, the new feature is toggled off until the problem is fixed.

1.3.6 Stable Targeting

Once a user session has been targeted for a test experience, it is desirable that the same user continue to see the same experience on subsequent visits, i.e. that targeting remain stable over time. Session-scoped targeting stability guarantees preservation of targeting information over the duration of a Variant session, while experiment-scoped targeting stability refers to preservation of targeting information over the lifetime of an experiment, i.e. between Variant sessions.

Variant server provides session-scoped targeting stability automatically. When experiment-scoped stability is required, Variant provides two mechanisms, one client-side and one server-side, to help you implement experiment-scoped targeting stability for recognized users.

1.3.7 High Extensibility

Variant Experiment Server is made to be extended via the server-side Extension API. The central idea in ExtAPI is user hooks, or callback functions, posted by various server life cycle events. For example, the session qualification life cycle event is triggered when a user session first comes in contact with an experiment, and Variant server must decide whether the user is qualified for it. You can create a user hook for this event, which will augment the default functionality with specific qualification semantics based on live operational data, such as “only include organic search traffic.”

User hooks are defined in the experiment schema and can be scoped to the entire schema, a particular test, or a particular state. Hooks can also be chained, to help you modularize and reuse your code.

2 Variant Architecture


Variant server is a standalone process, accessible at run time via a client library. It should be deployed on the network local to the host application and the operational database, facilitating low network latency and real-time integration with the host application’s operational data.

Variant’s functionality is cleanly divided between the server and the client:

Client:
  • Session management.
  • Session tracking.
  • Targeting stability.
  • Traffic routing.
  • Exception handling.

Server:
  • Session management.
  • Session qualification.
  • Session targeting.
  • Targeting stability.
  • Schemata management.
  • Event logging.
  • Extension management.

The following diagram presents a high-level overview of the different components of Variant software platform:

Variant server does not have a persistent state. Each time the server boots, it re-initializes its state. The server’s configuration is read from the config files, and the experiments’ definitions are read from the schema files.

2.2 Server Configuration

Variant server is configured via configuration keys, which are typically collected in config files. Variant uses the Typesafe Config library, an implementation of the HOCON configuration grammar.

At startup, Variant server looks for configuration in the file conf/variant.conf. If it is found, its contents override the default settings, provided in conf/com/variant/server/boot/variant-default.conf.

This behavior may be overridden in the following ways:

% start -Dvariant.config.file=/path/to/alt/config/as/file

specifies an alternate config file as a file system file. Alternatively, you may provide the alternate config file as a classpath resource:

% start -Dvariant.config.resource=/path/to/alt/config/as/resource

The simplest way to add a resource to the server classpath is to place it in the server’s conf directory. It is an error to set both variant.config.file and variant.config.resource system properties.

Additionally, each individual config key may also be overridden by setting the like-named system property, like so:

% start -Dvariant.schemas.dir=/home/variant/schemas

For a complete list of supported configuration parameters, refer to the Variant Experiment Server Reference.
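The same override shown above as a system property can equally live in the config file. A minimal conf/variant.conf in HOCON syntax might look like the following sketch (key names should be checked against the Variant Experiment Server Reference):

```hocon
# conf/variant.conf
# Overrides the like-named setting in variant-default.conf.
variant.schemas.dir = /home/variant/schemas
```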

2.3 Integration With the Host Application

A host application integrates with Variant server via a client library suitable for its language. Variant client exposes Variant functionality via high-level native methods and object bindings, easily consumable by the host application. Variant release 0.8 ships with a fully functional Java client and a partial JavaScript client, suitable for deployment to Web browsers.

2.3.1 Variant Java Client

Variant Java client has no dependencies other than the Java Runtime, release 7 or later. This makes it consumable by any host application, such as a Web server, a RESTful server, or even a CTI application such as a call center, so long as it is written in Java or another JVM language, e.g. Scala. Throughout the documentation, we refer to this client library as the bare client.

Bare client’s flexibility comes at the expense of complexity, as it depends on several objects that the host application must provide at run time, e.g. an external mechanism to track Variant session ID. Luckily, most Java Web applications are written on top of the Servlet API. These applications should take advantage of the servlet adapter to the bare Java client. The servlet adapter wraps the bare Java API with a higher level client library, which re-writes environment-dependent function signatures in terms of familiar Servlet objects, like HttpServletRequest. The Servlet adapter preserves 100% of the bare client’s functionality and comes with out-of-the-box implementations of all environment-dependent objects. For more details, refer to the Variant Java Client User Guide.

2.3.2 Variant JavaScript Client

Variant.js is a Variant client which supports triggering of Variant events from a Web browser environment. (This will be expanded to full functionality in an upcoming release in order to support single-page Web applications and server-side JavaScript.)

In order to trigger an event, all the application programmer needs to do is:

<script src=""></script>
<script>
  new variant.Event("event-name", "event-value").send();
</script>

For more information, refer to the Variant JavaScript Client User Guide.

3 Experiment Definition Model (XDM)

3.1 About Interactive Applications

The only assumption Variant makes of the host application is that it is interactive, i.e. responds to real-time user input. Each user session traverses some user experience, provided by a computer program, by way of consecutive repetitions of the same structural pattern that resembles a dialog: the system and the user alternate in responding to each other’s response, as shown in figure 2 below.


The pattern is symmetric: the user and the application do the same two things: 1) wait for the other’s response, and 2) respond. In the figure above, these two steps, as related to the computer program, are denoted with a yellow background, while the same two steps, as related to the human, are left transparent.

3.2 State Request Loop

From the user’s standpoint, each revolution around the conversation loop is a state transition: the application reaches a certain state when it presents the user with a prompt and pauses for user input. When the user provides such input, it contains the intended next application state. For example, when a caller punches number “3” in response to a telephone menu, he initiates a specific state transition that takes him from his current state to the state denoted by number “3”.

The application accomplishes the state transition by processing user input and, depending on the outcome, presenting the next prompt. More precisely, a state transition has these three phases:

  • Process user input.
  • Figure out an appropriate next state.
  • Render the next state to the user.

Once the new state is rendered to the user, e.g. caller hears the next phone menu, the application pauses for new user input.

However, if there is an experiment instrumented on the requested state, there is an alternate code path that potentially alters or even replaces all three of these steps. Consequently, Variant must have a chance to target this session upstream of any host application code, as illustrated in the state request sequence diagram below:

State Request Sequence Diagram

  1. The host application receives user input, which contains some identification of the intended state transition. E.g., in a Web application, user input may be submitted as a form POST to a certain server resource path. The host application uses this information to figure out if the intended state is instrumented by any experiments. If not, the host application carries on unencumbered, as indicated by the dashed arrow lines.
  2. If the requested application state is instrumented by Variant, this user session needs to be targeted for all the tests that instrument this state. But first, the host application must obtain a Variant session.
  3. Target this Variant session for the intended state. This operation packs most of Variant’s under-the-hood complexity. Its result is a Variant state request that, among other things, contains the session’s targeting information, e.g. the state variant to be served to the end user, instead of the control that was requested.
  4. The host application takes over and runs the code path consistent with the targeted state variant.
  5. The state request is committed. This is the end of the life of a Variant state request: it cannot be used from this point on. One of the side effects of this operation is that Variant will flush the state visited event for the targeted state.
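Taken together, the steps above might be sketched as follows. Apart from Session.targetForState(), which this guide names, the object and method names are illustrative, not the actual client API:

```
// 1. Decode the intended state from user input (application-specific).
state = decodeIntendedState(userInput)

// 2. Obtain the Variant session (hypothetical call).
session = variantClient.getSession(sessionId)

// 3. Target the session for the intended state.
request = session.targetForState(state)

// 4. Run the code path consistent with the targeted state variant,
//    e.g. consulting resolved state parameters (hypothetical call).
render(request.getResolvedParameter("path"))

// 5. Commit the request; Variant flushes the state visited event.
request.commit()
```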

3.3 From Request Loop To a Model

Central to Variant’s technology is the Experiment Definition Model (XDM). Its purpose is to capture the definitions of all tests running on a Variant instance in a single, human readable document external to the host application. This clean separation between experiment instrumentation and experience implementation means that the application developer can continue focusing on the application code, while the experiment designer is able to define the experiment in XDM, using nothing more than a text editor.

XDM is a self-contained abstraction, comprising a complete set of entities and relationships between them, required to fully describe a set of randomized controlled experiments (RCEs). To XDM, a state is a completely opaque concept: all XDM knows about a state is its name and that it may contain parameters whose meaning is external to XDM and only significant to the host application. A user experience under experiment passes from state to state, much like in a state machine.

As opposed to states, tests are actualized and managed by Variant XDM. They are the experiments instrumented on the states and, at a minimum, a test must have:

  • At least two experiences: control, typically mapped to the existing user experience, and at least one variant experience that represents the new code path under test. An experiment may have more than one variant experience, in which case it is traditionally referred to as multivariate.
  • A list of states on which the test is instrumented.

The test designer captures an instance of XDM in an XDM schema, like the one in Listing 2.

3.4 XDM Schema

An instance of XDM is called an XDM schema. It describes a set of experiments instrumented on a host application. Its principal benefits are:

  • Representational Clarity:
    Anyone can glean a complete understanding of all instrumented experiments by looking at the schema document.
  • Operational Compliance:
    A schema document is just another source file: it can be stored in a revision control system, securely managed by multiple developers, rolled back if bugs are discovered, etc.
  • Declarative Instrumentation:
    As already pointed out, one of Variant’s principal advancements is that it provides declarative instrumentation of experiments. XDM achieves this by taking instrumentation concerns out of the host application’s code and into a single schema document.

At this time, the only way to capture an XDM schema is via the JSON syntax. The following section introduces XDM concepts by example, while the complete reference is presented in the Variant Experiment Server Reference.

A minimal valid Variant XDM schema consists of a single state, instrumented by a single experiment with a single variant experience:
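A sketch of such a schema, consistent with the discussion that follows, is shown below. The schema and state names, and the onStates/stateRef syntax, are assumptions based on the surrounding text; consult the Variant Experiment Server Reference for the exact grammar. The line numbers cited in the discussion refer to the original listing, not to this sketch.

```json
{
  "meta": {
    "name": "RecaptchaDemo",
    "comment": "Measures the impact of adding a Recaptcha form to the password reset page."
  },
  "states": [
    { "name": "resetPassword" }
  ],
  "tests": [
    {
      "name": "RecaptchaTest",
      "experiences": [
        { "name": "noRecaptcha", "weight": 9, "isControl": true },
        { "name": "withRecaptcha", "weight": 2 }
      ],
      "onStates": [
        { "stateRef": "resetPassword" }
      ]
    }
  ]
}
```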

Here, we model an experiment that attempts to measure the consequences of adding a Recaptcha form to the password reset page. The meta section (lines 2-5) contains the schema name and description. When this schema is deployed to a Variant server, it will be known to client code by that name.

Lines 6-10 contain the state definition with the single property name — a name string the test designer gave to the password reset page. The rest of the schema defines the single test called RecaptchaTest (line 13). Lines 14-24 contain the definitions of its two experiences, noRecaptcha and withRecaptcha. The isControl property on line 18 tells Variant that the noRecaptcha experience is the control experience for this test — a reasonable choice for a controlled experiment, which always seeks to use the existing code path as the control, to which the new code path will be compared. The weight properties on lines 17 and 22 assign probabilistic weights to the two experiences, based on which Variant will randomly route 9 out of every 11 user sessions to the control experience and 2 to the variant withRecaptcha experience.

The schema above is the minimal syntactically valid schema, but any practically useful schema will need to take advantage of state parameters to augment the variant states with application context that can be consumed by the host application at run time — as in the next example, borrowed directly from the Variant Demo Application, which comes with the Variant distribution. It introduces a number of important new concepts:

  • The test spans multiple states.
  • The test has two variant experiences, a.k.a. multivariate.
  • State parameters.
  • On/off tests.
  • Nonvariant tests.

The test is instrumented on two states: newOwner and ownerDetail (lines 14-36). Both states have state parameter ‘path’, which the host application will use to map a state to the page by its resource path.

There’s only one test in the Pet Clinic schema, NewOwnerTest. It has three experiences (lines 41-55): the control and two variant experiences, which implement the two different updates to the page. The tosCheckbox experience adds the Terms Of Service acceptance checkbox, and the tos&MailCheckbox experience adds that and the email opt-in checkbox. User sessions will be targeted to either one of these three experiences randomly, based on their relative probabilistic weights: each will get roughly the same number of visitors.

Note the isOn property on line 40. By default, a test is on, so we didn’t have to include it in this case. But if a test designer wants to temporarily suspend a test (which is different from removing it from the schema altogether), this property can be set to false.
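A sketch of how that might look inside the test definition (exact placement and syntax per the Variant Experiment Server Reference):

```json
{
  "name": "NewOwnerTest",
  "isOn": false
}
```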

The sole NewOwnerTest is instrumented on two states: newOwner (lines 57-79) and ownerDetail (lines 80-83). The state newOwner has two variants (lines 59-78) — one for each variant experience. The state ownerDetail is declared as nonvariant (line 82) — the declaration that tells Variant to always render the control state variant, regardless of the targeted experience. The reason we want the ownerDetail page in the test, even though we don’t have experience-specific variants for it, is event logging: when we analyze the test, we will want to know if the user ever reached this state.

In this simple case, the variant space of the newOwner state consists only of three cells which correspond to the three experiences: outOfTheBox (control), tosCheckbox and tos&MailCheckbox. In those user sessions which get targeted to the control experience, the state’s base property value for ‘path’ on line 21 remains unchanged. But if Variant resolves the state request to either of the two variants, the value of the ‘path’ parameter is overridden with what is declared by the state variant on line 65 or 74.

This mechanism of state parameter overrides enables the host application to simply access a parameter by name and receive a state variant-specific value. In this case, the parameter is ‘path’: its value has no meaning to XDM, and its name is only important insofar as Variant overrides like-named parameters. Refer to the Variant Demo Application to see how this information is used by the Servlet adapter to properly route user traffic through server-side forwards.

4 Mixed Instrumentation

Typically, all of a test’s experiences will be instrumented on the same set of states. But this doesn’t have to be the case. For example, you may want to split a busy page into two more manageable pages, or to consolidate two sparse pages into a single more functional one. In these types of tests, a state will be instrumented by some, but not all, of a test’s experiences — the condition referred to as mixed instrumentation.

An example of a mixed-instrumented test can be found in the next section, in figure 4. There, the Red test has two variant experiences, both of which are instrumented on state S2, but only one on state S3. In this case a variant instruments fewer states than the control. Likewise, the Green test is also mixed-instrumented, but here the variants instrument more states (S3 and S4) than the control (S4 only).

Mixed instrumentation is supported by XDM with the isDefined element of the state variant definition, as illustrated in Listing 3 below. Whenever a state variant is declared as undefined, the following semantics will apply at run time:

  • If the current session has not been targeted for this test, the targeting will be constrained to only those experiences that are defined on this state.
  • If the current session has already been targeted for this test, a runtime exception will be thrown if the session attempts to target for that state by calling the Session.targetForState() client API method.
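For illustration, a state variant declared undefined might look like the following sketch; the experienceRef element name is an assumption, so consult the Variant Experiment Server Reference for the exact syntax:

```json
"variants": [
  {
    "experienceRef": "red_2",
    "isDefined": false
  }
]
```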

5 Concurrent Experiments

Variant XDM offers full support for experiment concurrency, which is to say that any possible interleaving of two concurrent tests can be defined in XDM and, therefore, handled by Variant server at run time.

Whenever two tests have no states in common, they are called serial. Serial tests never interfere with one another, and hence their concurrent execution is equivalent to a serial execution. In figure 4 below, the Blue and the Green tests are serial.

Concurrent Experiments

Whenever two tests share states, they are called concurrent. When a user session targets a state that is instrumented by two or more tests, there is an entire variant space of possible experience permutations. For example, in figure 4 above, state S2 is instrumented by Blue and Red tests. Blue test only has one variant experience and Red test has two variant experiences, so the complete variant space has 6 cells:

Variant Space

First, let’s consider a pseudo-serial execution, i.e. run the two tests in isolation, one after the other. To support the Blue test by itself, the application developer would only need to implement the S2blue experience. Similarly, to support the Red test by itself, the application developer would implement its two variant experiences, S21red and S22red. These three proper variant experiences occupy the peripheral portion of the variant space in Fig. 5 above.

But if the test designer were to run the two tests concurrently, he would have a problem to solve: what if a session gets targeted for variant experiences in both tests? This can be addressed in two ways: 1) stick with proper experiences only and let Variant target user sessions only to the supported subset of the variant space, or 2) achieve real concurrency by implementing the hybrid experiences. The first option can be thought of as pseudo-concurrent — no user session will be targeted for the Blue variant if it’s already targeted for a Red variant, and vice versa. This mode of constrained concurrency is referred to as disjoint concurrency: no two tests are in a variant at the same time.
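The constraint can be illustrated on the variant space of state S2 from figure 5. The grey, blue, red_1 and red_2 experience names come from the schema discussed later in this section; the Red test’s control experience name below is an assumption:

```javascript
// Cross the experiences of the two tests instrumented on state S2.
// Index 0 of each list is the control experience.
const blue = ["grey", "blue"];                // control + 1 variant
const red = ["redControl", "red_1", "red_2"]; // control + 2 variants

// The full variant space: every combination of the two tests' experiences.
const allCells = blue.flatMap(b => red.map(r => [b, r]));

// Disjoint concurrency: a session is never in a variant of both tests at
// once, so cells where neither coordinate is a control are unreachable.
const disjointCells = allCells.filter(
  ([b, r]) => b === blue[0] || r === red[0]
);
```

Of the 6 cells in the full space, only 4 are reachable under disjoint concurrency; the remaining 2 are the hybrid cells that covariant concurrency requires the developer to implement.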

The other mode is called covariant concurrency and is the true, unconstrained concurrency, where a session’s ability to participate in the Red test is not constrained by its participation in the Blue test, and vice versa.

By default, all concurrent tests in Variant are disjoint: no extra work on the part of the application developer is required to run two experiments in a disjointly-concurrent fashion. This is particularly useful in cases of a high degree of concurrency, when states are instrumented by multiple tests. The disadvantage of disjoint concurrency is that it may starve downstream tests of traffic. Covariantly concurrent tests are free from this drawback, but require the application developer to provide the hybrid variants. When this is desirable, XDM provides a complete set of abstractions to support arbitrarily complex covariantly-concurrent tests, as explained further in this section.

The relationship of concurrence between two tests has the following properties:

  • Symmetric: If a test T1 is concurrent with T2, then T2 is concurrent with T1.
  • Not Reflexive: a test cannot be concurrent with itself.
  • Not Transitive: If T1 is concurrent with T2 and T2 is concurrent with T3, T1 and T3 need not be concurrent.

Let’s illustrate the XDM definitions of concurrent tests with the schema for the Blue, the Red and the Green tests, introduced in figure 4. The schema below defines three tests on four states. State S2 is instrumented by the Blue and the Red tests, and state S3 by the Red and the Green tests. It implements the Red and the Blue tests as covariantly concurrent and the Green and the Red as disjointly concurrent.

The four states, on which the three experiments are instrumented, are defined on lines 5-34. Each state defines two state parameters, p1 and p2. As before, these parameters have no meaning to XDM — their significance is intrinsic to the host application. In each user session, the values of these parameters may be overridden with those declared by a state variant to which this state happens to resolve.

The Blue test, instrumented on states S1 and S2, is defined on lines 36-67. It contains definitions of its experiences (lines 38-46) and of the state variants corresponding to these experiences (lines 47-66). The experiment has two experiences: grey (control) and blue (the sole variant). There are no weights assigned to these experiences, which is not an error, so long as the test designer supplies a custom targeter for this test, as explained in section 8.1, User Hooks. The two state variants for the sole variant experience blue are defined on lines 47-56 and 57-65. Each of these defines an optional set of parameters whose values, if named as a base parameter, will override the base value.

The Red test is instrumented on states S2 and S3. On line 70 it declares its covariance with the Blue test. This declaration makes XDM expect definitions of all hybrid state variants for all states on which both Red and Blue tests are instrumented: in this case S2.

The Red test’s experiences are defined on lines 71-82. As opposed to the Blue test, it has two variant experiences, red_1 and red_2 — the case typically referred to as multivariate. The state variants are on lines 84-116 for state S2 and lines 117-129 for state S3. The latter only contains the Red test’s proper state variants, but the former is the interesting case where, in addition to the Red test’s proper variants, the test designer must also provide the hybrid variants, lines 95-114.

Finally, the Green test, instrumented on states S3 and S4, is defined on lines 132-167. It is concurrent with the Red test, because they both instrument state S3. But in this case, the test designer chose not to provide the hybrid state variants and let Variant default to disjoint concurrency: Variant will not target a user session for a variant experience in both the Red and Green tests.

6Running Experiments

6.1Runtime Overview

At runtime, Variant server has the following principal responsibilities:

  • Session Management
  • Session Qualification
  • Session Targeting
  • Targeting Stability
  • Event Logging
  • User Hook Management

The following sections consider these functions in detail.

6.2Session Management

Variant has its own notion of user session, independent from that of the host application. Variant session provides a way to identify the user across multiple state requests and contains session-scoped application state that must be preserved between state requests. Variant server acts as the centralized session repository accessible to any Variant client by the session ID. It is the responsibility of the host application to hold on to the session ID between calls to the server.

Variant maintains its own session, rather than relying on the host application’s, because it is frequently desirable for Variant session to survive the destruction of the host application’s session. For example, if the host application is a Web application, it natively relies on the HTTP session, provided to it by the Web container, like Tomcat. If a Variant experiment starts on a public page and continues past the login page, it is possible (in fact, quite likely) that the host application will recreate the underlying HTTP session upon login. If Variant session were somehow bound to the HTTP session, it would not be able to span states on the opposite sides of the login page. But because Variant manages its own session, the fate of the host application’s HTTP session is irrelevant, enabling Variant to instrument experiments that start with an unknown user and end with an authenticated one, or vice versa.

Variant’s session is distributed in the sense that Variant guarantees a consistent view of it across all clients that may be accessing it. In a distributed environment, multiple components of the host application may be handling a user action and hence working with the same session. Any change to Variant session’s state is seamlessly reflected in all the clients accessing it.
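The distributed-session idea can be sketched as a centralized, thread-safe repository keyed by session ID: any client that holds the ID sees the same session state. The class and method names below are illustrative stand-ins, not the server's actual internals:

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// A minimal sketch of a centralized session repository keyed by session ID.
// Hypothetical names -- not the Variant server's real implementation.
public class SessionRepository {

    // Session-scoped state shared by all clients holding the same ID.
    public static class Session {
        public final String id;
        public final Map<String, String> attributes = new ConcurrentHashMap<>();
        Session(String id) { this.id = id; }
    }

    private final Map<String, Session> sessions = new ConcurrentHashMap<>();

    // Create a new session and hand its ID back to the caller. The host
    // application is responsible for holding on to this ID between calls.
    public String create() {
        String id = UUID.randomUUID().toString();
        sessions.put(id, new Session(id));
        return id;
    }

    // Any client that knows the ID gets the same session object, so a change
    // made by one application component is visible to all others.
    public Session get(String id) {
        return sessions.get(id);
    }
}
```

Because the repository, not the host application's HTTP session, owns the state, the session survives events like login-time HTTP session recreation.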

6.3Session Qualification

Variant supports the notion of session qualification for an experiment. As an example, suppose a newspaper is interested in testing promotional discount rates, offered on its website, but only to the traffic that comes from organic search. This means that all the traffic coming from anywhere else, e.g. an online ad that might be offering a different promotion, must be excluded from the test.

By default, all sessions qualify for all tests. To add custom qualification semantics, the application developer needs to implement a listener for the UserHook<TestQualificationLifecycleEvent>  user hook. This hook is posted when the test qualification life cycle event is triggered, i.e. the first time a session hits a state instrumented by a live test. The event fires only once per user session per test; consequently, a session’s qualification remains in effect for its lifetime, even if the qualifying condition’s truth value has since changed.

When a session is disqualified from an experiment, the following applies:

  • It is temporarily assigned to (but not targeted for) the control experience in that experiment.
  • No Variant events will be triggered with respect to disqualified tests.
  • The UserHook<TestQualificationLifecycleEvent>  user hook has the ability to discard the entry for this test from the targeting tracker.
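As a concrete illustration of the newspaper example above, here is a sketch of a qualification hook that admits only organic-search traffic. The event interface below is a simplified stand-in, not the actual UserHook<TestQualificationLifecycleEvent> signature from the ExtAPI:

```java
import java.util.Set;

// Simplified stand-in for the ExtAPI's qualification life cycle event --
// the real TestQualificationLifecycleEvent has a different shape.
interface QualificationEvent {
    String referrerHost();          // where the session came from
    void setQualified(boolean q);   // qualify or disqualify the session
}

// A hypothetical hook that qualifies a session for the discount test only
// if the traffic came from an organic search engine.
public class OrganicSearchQualifier {

    private static final Set<String> SEARCH_ENGINES =
            Set.of("www.google.com", "www.bing.com", "duckduckgo.com");

    // Posted once per session per test, when the session first hits an
    // instrumented state; the decision then sticks for the session's lifetime.
    public void post(QualificationEvent event) {
        event.setQualified(SEARCH_ENGINES.contains(event.referrerHost()));
    }
}
```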

6.4Session Targeting

Once a session has been qualified for a test, Variant must target it for one of that test’s experiences. Even in the case of a single test, the targeting decision is not a simple coin toss, because Variant must consider the user’s recent browsing history—the test designer typically wants to ensure that a returning user sees the same experiences they have already seen. However, the complexity of this decision grows dramatically for concurrent tests, where certain targeting outcomes may not be feasible. Figure 6 below presents the steps involved in targeting.

  1. The host application must determine whether a state is instrumented by Variant. The application developer should use state parameters to attach enough application context to a Variant state for the host application to make this determination. If the requested application state is not instrumented, Variant defers back to the host application.
  2. If the state is instrumented, Variant determines all the tests that are instrumented on it. If more than one, we have the case of concurrent tests. For each of these tests:
  3. If the current session has not yet been qualified for this test, Variant posts qualification hook listeners. If user code disqualifies the session, the session will not be cognizant of the test.
  4. If qualified, Variant examines the content of this session’s targeting tracker, explained in detail in the next section. It contains information about the experiences for which this session has already been targeted. If the test has an entry in the targeting tracker and it is resolvable in the current schema, the entry is honored.
  5. If the test does not have an entry in the targeting tracker, or the existing entry is not resolvable in the current schema, then the session is targeted. First, Variant posts targeting hook listeners. If user code returns an experience, it is honored, unless it is unresolvable in the current schema.
  6. If user code does not return an experience, Variant delegates to the default probabilistic targeter.

If there are concurrent tests that have already been targeted for this session, this test is not eligible for a variant experience if any of the concurrent tests are disjointly concurrent with this test and targeted for a variant experience. In this case, the session is targeted for this test’s control experience—the case of constrained targeting.

Finally, if there are no disjointly-concurrent tests that have already been targeted to a variant experience, this test’s targeting is unconstrained. Variant triggers the TestTargetingLifecycleEvent . This will post any UserHook<TestTargetingLifecycleEvent>  user hooks applicable for this test. If any of them returns an experience, it is honored. Otherwise, if no hooks were defined in the schema or none has returned an experience, Variant runs the default targeter, which picks an experience randomly, with probabilities defined by the weight property of the test experience XDM definition.

As soon as a session is targeted, the target experience is added to the targeting tracker.
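The unconstrained targeting step described above can be sketched as standard weighted random selection over the experiences' weight properties. The class and method names below are illustrative, not the server's actual default targeter:

```java
import java.util.List;
import java.util.Random;

// A sketch of the default probabilistic targeter: pick an experience at
// random, with probabilities proportional to the weights declared in the
// schema. Names are illustrative only.
public class WeightedTargeter {

    public static class Experience {
        public final String name;
        public final double weight;
        public Experience(String name, double weight) {
            this.name = name;
            this.weight = weight;
        }
    }

    private final Random random;

    public WeightedTargeter(Random random) { this.random = random; }

    // Standard weighted random selection: draw a point in [0, totalWeight)
    // and walk the experiences until the running sum passes it.
    public String target(List<Experience> experiences) {
        double total = 0;
        for (Experience e : experiences) total += e.weight;
        double point = random.nextDouble() * total;
        double sum = 0;
        for (Experience e : experiences) {
            sum += e.weight;
            if (point < sum) return e.name;
        }
        // Unreachable for positive weights; guards against rounding error.
        return experiences.get(experiences.size() - 1).name;
    }
}
```

An experience with weight 0 is never picked by this procedure, which is why experiences without usable weights require a custom targeting hook instead.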

6.5Targeting Stability

Targeting stability refers to the preservation of targeting information between state requests. Session scoped targeting stability guarantees such preservation for the duration of Variant session and is provided automatically by Variant server: once a session has been targeted for an experience, it sticks with the chosen experience for the rest of the session.

Targeting stability between sessions is referred to as experiment scoped, i.e. a returning user sees the same experience even if their session has expired. Variant cannot provide this feature automatically, but offers two mechanisms for the application developer to achieve it.

6.6Event Logging

Variant events are the elementary data points generated by Variant experiments, which can be persisted to external storage for analysis. They may be generated implicitly, at certain predefined points in the life of a Variant session, or explicitly by the client code. In either case, the application developer may attach arbitrary attributes to these events, if these attributes will be useful during analysis.

The only automatically triggered event is the state visited event. It is created at the end of the targeting step, Figure 3, and is added to Variant session, but not yet triggered. This gives the host application a chance to attach additional attributes to the event, if it wishes to. For example, if the host application caught an unexpected error, it may wish to set the status of the event to error, so that it can be excluded from analysis. The state visited event is triggered when the host application commits the state request object, after it completes the code path corresponding to the targeted state variant.

Custom events can be triggered by calling an appropriate client API method, e.g. Session.triggerEvent()  in the Java client.
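The attach-attributes-then-trigger pattern described above can be sketched as follows. The Event class here is an illustrative stand-in, not the Variant client API's actual event type:

```java
import java.util.HashMap;
import java.util.Map;

// An illustrative sketch of the attach-then-trigger event pattern.
// Hypothetical class -- not the Variant client API.
public class CustomEvent {

    private final String name;
    private final Map<String, String> attributes = new HashMap<>();
    private boolean triggered = false;

    public CustomEvent(String name) { this.name = name; }

    public String name() { return name; }

    // Attributes may be attached at any point before the event is triggered,
    // e.g. status=error so the event can be excluded from analysis.
    public CustomEvent attribute(String key, String value) {
        if (triggered) throw new IllegalStateException("event already triggered");
        attributes.put(key, value);
        return this;
    }

    // Triggering seals the event; in the real client this is the point at
    // which the event is handed to the server's event writer.
    public Map<String, String> trigger() {
        triggered = true;
        return Map.copyOf(attributes);
    }
}
```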

7Analyzing Experiments

7.1Data Transformation

Each Variant experiment is designed with a particular target metric in mind. For instance, in our running example of the Pet Clinic application, each datum represents a page view event, either of the newOwner page or the ownerDetail page, and the metric of interest is the conversion rate from the former to the latter.

The raw event data generated at run time cannot be directly used for statistical analysis because they are too low level. They must first be transformed into higher-level data representing the target metric of interest, the conversion rate in our example. Typically, this is accomplished with SQL or a data transformation tool, but may require a custom data transformation program. A distributed data processing framework, like Hadoop, can also be successfully deployed to transform Variant raw data.
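As a sketch of this transformation step, the program below reduces raw state-visited events to a per-experience conversion rate from newOwner to ownerDetail, as in the Pet Clinic example. The event fields are illustrative, not the actual persisted event schema:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// A sketch of the data transformation step: reduce raw state-visited events
// to a conversion rate per experience. Field names are illustrative.
public class ConversionRate {

    public record Event(String sessionId, String experience, String state) {}

    // Conversion rate = distinct sessions that reached ownerDetail
    //                 / distinct sessions that reached newOwner.
    public static double compute(List<Event> events, String experience) {
        Set<String> visited = new HashSet<>();
        Set<String> converted = new HashSet<>();
        for (Event e : events) {
            if (!e.experience().equals(experience)) continue;
            if (e.state().equals("newOwner")) visited.add(e.sessionId());
            if (e.state().equals("ownerDetail")) converted.add(e.sessionId());
        }
        converted.retainAll(visited); // count only sessions that saw newOwner
        return visited.isEmpty() ? 0.0 : (double) converted.size() / visited.size();
    }
}
```

In production this reduction would more likely be a SQL aggregation over the event tables, but the logic is the same.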

7.2Statistical Analysis

The goal of an experiment is to

  • Discover if there is a difference between control and variant experience(s) with respect to the target metric of interest;
  • Assess the likelihood that this difference can be explained by chance alone.

The latter can be accomplished with some well-known mathematical formulas developed in the field of statistical hypothesis testing. The fundamental idea there is to develop a procedure that will enable the researcher to make a claim about the entire population with a given degree of certainty, based on a set of sample observations. Refer to the Statistical Analysis of Variant Experiments white paper for more information.
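One such well-known procedure is the two-proportion z-test, sketched below for comparing the conversion rates of two experiences. This is a standard textbook formula, not material from the white paper itself:

```java
// A sketch of the hypothesis-testing step: a two-proportion z-test comparing
// the conversion rates of the control and variant experiences.
public class TwoProportionZTest {

    // Returns the z statistic for H0: p1 == p2, given x conversions out of
    // n trials in each experience. |z| > 1.96 rejects H0 at the 5% level.
    public static double z(int x1, int n1, int x2, int n2) {
        double p1 = (double) x1 / n1;
        double p2 = (double) x2 / n2;
        // Pooled proportion under the null hypothesis of no difference.
        double pooled = (double) (x1 + x2) / (n1 + n2);
        double se = Math.sqrt(pooled * (1 - pooled) * (1.0 / n1 + 1.0 / n2));
        return (p1 - p2) / se;
    }
}
```

For example, 60 conversions out of 100 versus 40 out of 100 yields |z| > 1.96, so the difference is unlikely to be explained by chance alone at the 5% level.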

8Feature Toggles

Feature toggles are closely related to online controlled experiments. They use the same instrumentation mechanism, but with the goal of post-deployment testing. Whenever you roll out a new product feature, toggles enable you to first publish it to a limited population of users, while sending all others into the existing experience. If all goes well, you gradually increase traffic into the new code path until full production. But if a defect is discovered, the new feature is toggled off until the problem is fixed.

You can leverage Variant’s powerful instrumentation capabilities to implement feature toggles. Custom targeting hooks can be used to manage which user population is sent into the new code path. If a defect in the new code path crops up, it is toggled off by simply flipping the ‘isOn’ property to false. All traffic to the new code path will stop without interrupting the host application. After the problem is fixed and retested, and the host application redeployed, you may flip the ‘isOn’ property back to true to allow user traffic back into it.

You can also use Variant’s custom events to trace the new code path even if you are not interested in comparative performance analysis.

Typically, you will group all of your toggle experiments in a separate schema, as in Listing 8.1 below:

There are two toggle experiments in this example: Feature1 (lines 23-57) operates on pages page1 and page2, and Feature2 (lines 58-97) operates on pages page2 and page3.

Both toggle experiments use the null event flusher (lines 4-6), which comes with Variant Experiment Server and discards any events generated by them. (More likely, you will use a custom flusher that saves Variant events to persistent storage, as they may help with debugging, should problems crop up.)

The two experiments in Listing 8.1 take different approaches to user targeting. The Feature1 experiment uses the default probabilistic targeter, which assigns new visitors to either the old or the new experience randomly, with, on average, one in a thousand visitors going into the new experience.

The Feature2 experiment uses the targeting hook class my.domain.ZipCodeTargeter which targets sessions for the new experience so long as the user lives in one of the configured zip codes.

9Variant Server Extension API

Variant server’s functionality can be extended through the use of the server-side extension API, or ExtAPI. It provides a mechanism for injecting custom semantics into the server’s regular execution path:

  • User hooks are callback functions that can be attached to certain life cycle events, to allow custom application logic to take over and alter the default semantics. User hooks are configured in the experiment schema.
  • Event Flushers allow the application developer to alter the persistence details of Variant events. They are configured in the experiment schema, with a server-wide default in the server config file.

See Variant Experiment Server Reference for complete details.

9.1User Hooks

User hooks are listeners for events, which are triggered by Variant server at well-defined points of its life cycle, called life cycle events. Hooks are defined in the experiment schema, e.g. on lines 73-82 of Listing 2, where we have defined two hooks: FirefoxDisqualifier , which disqualifies all traffic coming from a Firefox browser, and ChromeTargeter , which targets all traffic coming from a Chrome browser to the control experience.

Depending on where in the experiment schema a user hook is defined, it may have the scope of the entire schema, of a state, or of a test. Wherever it is defined, a hook definition must provide the name of the class implementing the hook, which is looked up at schema parse time.

9.2Pluggable Event Flusher

An event flusher is a configurable module responsible for writing out Variant events to some form of external storage. These events are then used for analysis of Variant experiments. The most common Variant event is the state visited event. It is triggered automatically whenever a session visits a state with at least one live test. User code may also generate custom events.

Whenever a Variant event is triggered, implicitly by Variant server or explicitly by user code, it is picked up by the asynchronous event writer, where it is held in a memory buffer until a dedicated flusher thread wakes up and flushes it out to external storage. The size of the event buffer is configurable via the variant.event.writer.buffer.size config key. The larger the buffer, the better the event writer will cope with bursty inputs, but at the price of a larger memory footprint. Whenever the flusher thread is not keeping up with the event load, the event writer will discard new events until the flusher thread has consumed some of the pending events.
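The buffer-and-discard behavior just described can be sketched with a bounded queue: writes are non-blocking, and a new event is dropped when the buffer is full. The names below are illustrative, not the server's actual implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// A sketch of the asynchronous event writer: a bounded in-memory buffer
// that a flusher thread drains, with new events discarded when full.
// Hypothetical names -- not the server's real event writer.
public class AsyncEventWriter {

    private final BlockingQueue<String> buffer;

    public AsyncEventWriter(int bufferSize) {
        this.buffer = new ArrayBlockingQueue<>(bufferSize);
    }

    // Non-blocking write: returns false (event discarded) when the flusher
    // thread is not keeping up and the buffer is full.
    public boolean write(String event) {
        return buffer.offer(event);
    }

    // Drains up to max pending events; in the server this runs on the
    // dedicated flusher thread, which hands the batch to the configured
    // flusher class for the actual write to external storage.
    public List<String> flush(int max) {
        List<String> batch = new ArrayList<>();
        buffer.drainTo(batch, max);
        return batch;
    }
}
```

This is why a larger variant.event.writer.buffer.size absorbs burstier input: more events can wait in the queue before the discard path is hit.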

The actual writes are handled by a configurable flusher class. Variant comes with a few such flushers ready to use out-of-the-box, although most likely you will want to implement your own, suitable for your particular operational environment. A custom flusher must implement the EventFlusher  interface and is configured via the /meta/flusher/ schema property. All events generated by the experiments contained in the schema will be routed via this flusher. Each schema gets its own instance of the flusher.

If no flusher is defined by a schema, the server-wide default flusher, as configured in the server config file, is used.