How to go beyond web A/B testing with Optimizely Feature Experimentation

Published February 16, 2023 by Jacob Pretorius – Tech Lead & Optimizely MVP

Feature Experimentation, or Full Stack Experimentation as it was previously known, is an Optimizely product that enables a data-driven experimentation approach across any technology or device stack, including website, mobile, TV and IoT apps.

In this post, which follows on from our previous Web Experimentation article, we explain how Feature Experimentation goes beyond A/B testing to enable product teams to deliver higher quality releases, run safer tests and validate new features at scale.

This article will help you gain an understanding of how Feature Experimentation can be used in different scenarios, and how it offers greater control to deliver results at scale.

What Feature Experimentation can be used for

Feature Experimentation makes it possible to test the benefits of any product feature, regardless of device or complexity. Here are three examples to bring its capabilities to life:

A new membership flow for a digital publication (web and app based). Test if users see a new or existing membership plan, enter a new or existing membership sign-up journey, and find the right frequency of reading interruption for the highest uptick in new memberships.
Enabling beta features early for dedicated self-driving car fans (app and in-car based). Control what percentage of users receive new functionality and give those users the ability to opt in or out of each beta feature as it is refined, using the car maker's app.
An airline offering new ways for members to redeem rewards at self check-in (app, web, and check-in terminal based). Test which membership-based upgrades or rewards get redeemed the most when customers check in online or in person at airport check-in terminals.

Feature Experimentation capabilities

Product teams can take advantage of the multiple features offered by Optimizely Feature Experimentation. These features go further than web-only experimentation products like Optimizely Web Experimentation or the soon-to-be sunset Google Optimize.

Targeted and scaled rollouts – control over which types of users see a variation of a feature. Pick an audience with precision using advanced filters and attributes.

Test new features in production using feature flags – minimise release risk by letting development teams test new features in production behind a flag.

Subgroup testing – also known as canary testing, this allows teams to test a new feature on a subgroup of end users before rolling it out to a wider audience.

Real-time results – gain an instant understanding of the impact of experiments with results in real time. Optimizely's Stats Engine uses sequential error control and machine learning for accurate results.

Client-side and server-side SDKs – run numerous experiments across the digital estate without negatively affecting performance.

What are feature flags?

Feature flags (also known as feature toggles) are a technique used in product development where a feature's code can be turned on or off seamlessly at any time without releasing or deploying a new version of the product.

Feature flags allow teams to experiment with new features, test them with a subset of users, and know with confidence if a feature achieves the desired result or not.
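To make the idea concrete, here is a minimal sketch of a feature flag in Python. It is not Optimizely's SDK: the `FLAGS` store, the `is_enabled` helper and the payment method names are all hypothetical, and a real setup would read flag state from a flag service rather than a dictionary in memory.

```python
# Hypothetical in-memory flag store. In practice the state would come from
# a feature flag service, so it can change without a new deployment.
FLAGS = {"apple_pay": False}


def is_enabled(flag_key: str) -> bool:
    """Return the current state of a flag; unknown flags default to off."""
    return FLAGS.get(flag_key, False)


def payment_methods() -> list[str]:
    """List the payment methods to show at checkout."""
    methods = ["card"]  # assumed existing payment method
    if is_enabled("apple_pay"):
        # The Apple Pay code ships with the product but stays dormant
        # until the flag is switched on.
        methods.append("apple_pay")
    return methods
```

Flipping `FLAGS["apple_pay"]` to `True` changes what `payment_methods()` returns on the next call, with no code change in between.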

Our feature flag demonstration using Apple Pay

As a demonstration for this article, a new payment method, Apple Pay, is being tested for users on iOS devices (web and app based). While it usually makes sense to always have Apple Pay for iOS users, the example merchant’s existing payment processor does not support Apple Pay.

This means that along with enabling Apple Pay visually for users of the website or app, a new payment processor is also integrated in the back-end with support for Apple Pay. However, the new payment processor has a much higher transaction fee compared to the existing payment provider.

From a commercial perspective, the key question is whether the convenience of Apple Pay increases overall revenue from iOS users enough to justify the higher transaction fee. Or are iOS users motivated enough to check out using the existing payment methods, so that overall revenue would not change?

This problem can be addressed definitively with data using Feature Experimentation and feature flags.
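Before running the test, it can help to know how much extra revenue Apple Pay would need to generate to pay for itself. The sketch below is a simplification under stated assumptions: both processors charge a flat percentage fee, and every transaction in the Apple Pay group goes through the new processor. The fee values are purely hypothetical, not real processor pricing. Under those assumptions, net revenue breaks even when R_new × (1 − f_new) = R_old × (1 − f_old), which gives a required uplift of (f_new − f_old) / (1 − f_new).

```python
def break_even_uplift(old_fee: float, new_fee: float) -> float:
    """Relative revenue uplift needed for net revenue to stay level.

    Fees are fractions of transaction value, e.g. 0.015 for 1.5%.
    Assumes percentage-only fees and that all revenue in the test
    group is processed at the new fee.
    """
    if not (0 <= old_fee < 1 and 0 <= new_fee < 1):
        raise ValueError("fees must be fractions in [0, 1)")
    return (new_fee - old_fee) / (1 - new_fee)


# Hypothetical fees: 1.5% on the existing processor, 2.9% on the new one.
uplift = break_even_uplift(old_fee=0.015, new_fee=0.029)
print(f"Required revenue uplift: {uplift:.2%}")  # about 1.44%
```

If the experiment shows a revenue uplift above this threshold, the higher fee is covered. Below it, Apple Pay costs more than it brings in.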

Effective rollout with feature flags

Rollout with feature flags is a technique that allows teams to gradually release new features and functionality to a targeted subset of users, rather than releasing them all at once to the entire user base.

This allows teams to test new features in a controlled scope, gather feedback, and make any necessary adjustments before releasing the feature to the entire user base.

Here is an example of how to use rollout with feature flags:

1. Create a feature flag: the first step is to create a feature flag that can be used to enable or disable the new feature. In this example the feature flag is to enable/disable a new payment process using Apple Pay.

2. Set the initial state of the feature flag: initially, the feature flag is set to "off" so that the new feature is not available to any users.

3. Gradually roll out the feature: next, the team can gradually roll out the feature to a small audience. In this example, the team chose 20% of users in the UK aged between 18 and 25. To do this, they created a new custom audience and used the targeted delivery feature.

4. Monitor the performance of the feature: while the feature is being rolled out, teams should monitor its performance using real-time analytics or external user feedback tools such as in-app chat (if available).

5. Make any necessary adjustments: based on the real-time performance data, teams can make any necessary adjustments to the feature before rolling it out to the entire user base.

6. Gradually expand the rollout: once the team is satisfied with the performance of the feature, it can gradually expand the rollout to the entire user base.
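The steps above can be sketched in a few lines of Python. This is an illustration of the general technique, not a description of how Optimizely buckets users internally: the `User` class, the SHA-256 based `bucket` function and the `apple_pay_enabled` helper are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    user_id: str
    country: str
    age: int


def bucket(user_id: str, flag_key: str) -> int:
    """Map a user to a stable bucket in [0, 10000) for a given flag."""
    digest = hashlib.sha256(f"{flag_key}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 10_000


def in_audience(user: User) -> bool:
    """Custom audience from step 3: UK users aged 18 to 25."""
    return user.country == "UK" and 18 <= user.age <= 25


def apple_pay_enabled(user: User, rollout_percent: float) -> bool:
    """Decide whether this user sees Apple Pay at the current rollout level."""
    if not in_audience(user):
        return False
    # 20% corresponds to buckets 0..1999 out of 10000.
    return bucket(user.user_id, "apple_pay") < rollout_percent * 100
```

Because the bucket depends only on the user ID and the flag key, a user who sees the feature at 20% keeps seeing it when the rollout expands to 50% or 100%. Setting `rollout_percent` to 0 matches the initial "off" state in step 2.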

A/B testing features that offer control

Similar to the functionality available in Optimizely Web Experimentation, Feature Experimentation makes it possible to run A/B or Multi-Armed Bandit tests on feature flags. This makes it possible to turn a feature ‘on’ or ‘off’ and allocate users so it can be determined which variation of the feature is the best performing for each business objective.

As an example, teams can create a simple A/B test including only 25% of users across all channels at a 50/50 split using the Apple Pay feature flag used previously.

This test will determine whether enabling Apple Pay within that 25% of users leads to an increase in revenue or has no positive effect.
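As a rough sketch of what that allocation means, the Python below includes 25% of users in the experiment and splits them 50/50 between the flag being ‘on’ and ‘off’. The hashing scheme, the `assign` function and the `apple_pay_ab` experiment key are hypothetical, and a Multi-Armed Bandit test would adjust the split over time rather than keep it fixed.

```python
import hashlib
from typing import Optional


def _bucket(key: str) -> int:
    """Stable bucket in [0, 10000) derived from a string key."""
    return int(hashlib.sha256(key.encode()).hexdigest(), 16) % 10_000


def assign(user_id: str, experiment_key: str = "apple_pay_ab") -> Optional[str]:
    """Return 'on', 'off', or None if the user is outside the experiment."""
    # Traffic allocation: only 25% of users enter the experiment.
    if _bucket(f"{experiment_key}:traffic:{user_id}") >= 2_500:
        return None
    # Variation split: an independent hash gives a 50/50 split
    # among the users who were included.
    if _bucket(f"{experiment_key}:variation:{user_id}") < 5_000:
        return "on"
    return "off"
```

Users who get `None` simply keep the existing checkout and are left out of the results, so the comparison is between the ‘on’ and ‘off’ groups only.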

Using variations adds depth to results

A/B tests are effective for simple on/off functionality, but they are often not flexible enough to realise a feature's full potential. Variations add richness to feature flags: the flag can still be ‘on’ or ‘off’, while additional variables take different values that developers can use to integrate alternate UX variations or user journeys.

Using the Apple Pay example, instead of just turning it ‘on’ or ‘off’, Apple Pay can be enabled with different button texts while we allow the platform to determine the best performing variant automatically.

This is achieved by setting up a default variable (a String is used in this example; Numbers, Booleans, and custom JSON objects are also supported).

The next step is to create the variations and their values. Here, we use ‘Payment (Apple Pay Secure)’ instead of the default ‘Pay with Apple Pay’ button text.

These variations can then all be tested until enough users have interacted with each option to determine the best outcome.
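On the development side, a variation can be thought of as an on/off state plus a set of variable overrides. The Python sketch below shows one way to model that. The `Variation` class, the variation keys and the `button_text` variable name are hypothetical; only the two button texts come from the example above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Variation:
    key: str
    enabled: bool
    # Variable overrides; anything not listed falls back to the default.
    variables: dict = field(default_factory=dict)


# Default variable value defined on the flag.
DEFAULTS = {"button_text": "Pay with Apple Pay"}

VARIATIONS = {
    "off": Variation("off", enabled=False),
    "default_text": Variation("default_text", enabled=True),
    "secure_text": Variation(
        "secure_text",
        enabled=True,
        variables={"button_text": "Payment (Apple Pay Secure)"},
    ),
}


def button_text(variation_key: str) -> Optional[str]:
    """Return the button label for a variation, or None if the flag is off."""
    variation = VARIATIONS[variation_key]
    if not variation.enabled:
        return None
    # Overrides win over defaults.
    return {**DEFAULTS, **variation.variables}["button_text"]
```

The front-end then renders whatever `button_text` returns, so adding a third wording later means adding a variation rather than shipping new code.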

Using the free rollouts account to get started

Optimizely offers a free rollouts plan that can be used to experiment with feature flagging, with the capacity to run one concurrent A/B test at no charge.

It is a production-ready plan capable of showing just how powerful a data-driven approach can be to help steer an iterative design and development process.

The development kits available

Optimizely Feature Experimentation comes with SDKs for both client-side and server-side implementations in many popular languages (Java, C#, JavaScript, Go, PHP, and more).

Each option comes with its own benefits and is better suited to specific scenarios. Both can be used simultaneously if needed to test a feature end-to-end.

Client-side SDKs are typically used to implement experiments on the front-end of a website or mobile app. The main benefit of using a client-side SDK is that it allows teams to quickly and easily make changes to the user interface without having to make any changes to the backend.

This can be useful for experiments that focus on the user journey or experience, such as enabling or disabling specific functionality or visual elements, that do not rely on server-side changes to work.

Server-side SDKs, on the other hand, are typically used to implement experiments on the backend of a website or mobile app. The main benefit of using a server-side SDK is that it allows teams to make deep-rooted changes to the underlying logic and functionality of a website, app, or service.

This can be useful for more complex experiments that focus on features such as payment, checkout flows or product recommendations, all without introducing additional performance issues.
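For the Apple Pay example, the server-side part of the experiment is the choice of payment processor. The sketch below shows how a back-end might route a checkout based on a flag decision. It does not use the Optimizely SDK: `CheckoutRequest`, `choose_processor` and the processor names are hypothetical, and the `apple_pay_enabled` argument stands in for whatever decision the server-side SDK returns for that user.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckoutRequest:
    user_id: str
    payment_method: str  # e.g. "card" or "apple_pay"


def choose_processor(request: CheckoutRequest, apple_pay_enabled: bool) -> str:
    """Pick the back-end payment processor for one checkout.

    The flag decision is checked again on the server instead of
    trusting the client, so a user outside the experiment cannot
    reach the more expensive processor.
    """
    if request.payment_method == "apple_pay":
        if not apple_pay_enabled:
            raise ValueError("Apple Pay is not enabled for this user")
        return "new_processor"
    return "existing_processor"
```

Using the same flag on the client (to show the button) and on the server (to route the payment) is what allows the feature to be tested end-to-end.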

Combining Feature Flags with Rollout means teams can enable and validate experiments across all their touchpoints from one central portal.

Real-time metrics

As with Web Experimentation, results are available in real time, giving product teams the power to enable, disable, or scale rollouts without having to wait hours or even days for analytics to become available.

These metrics help make informed decisions on features and where to best spend design and development time for each iteration of the product development process.

Wrapping up

Feature Experimentation is a very powerful product that can be leveraged in countless ways to validate user experiences at scale. It can also be used in conjunction with Web Experimentation to power a holistic experimentation design and development approach.

If you would like to learn more about how Feature Experimentation could support your product, digital and development teams to deliver against their strategic goals, whether that be increasing customer acquisition, decreasing support costs or driving customer loyalty, please don't hesitate to get in touch.