Skip to main content

Contextual Bandits

Usage with Contextual Multi-Armed Bandits

Eppo also supports contextual multi-armed bandits. You can read more about them in the high-level documentation. Bandit flag configuration--including setting up the flag key, status quo variation, bandit variation, and targeting rules--are configured within the Eppo application. However, available actions are supplied to the SDK in the code when querying the bandit.

To leverage bandits using the PHP SDK, there are two additional steps over regular feature flags:

  1. Add a bandit action logger to the SDK client instance
  2. Query the bandit for an action

Define a bandit assignment logger

In order for the bandit to learn an optimized policy, we need to capture and log the bandit's actions. This requires defining a bandit logger in addition to an assignment logger.

[Code Placeholder: Example of implementing IBanditLogger interface in PHP]

The SDK will invoke the logBanditAction() function with an IBanditEvent object that contains the following fields:

timestampstringDefault: undefined

The time when the action is taken in UTC as an ISO string. Example: "2024-03-22T14:26:55.000Z"

flagKeystringDefault: undefined

The key of the feature flag corresponding to the bandit. Example: "bandit-test-allocation-4"

banditKeystringDefault: undefined

The key (unique identifier) of the bandit. Example: "ad-bandit-1"

subjectKeystringDefault: undefined

An identifier of the subject or user assigned to the experiment variation. Example: "ed6f85019080"

subjectNumericAttributesarrayDefault: []

Metadata about numeric attributes of the subject. Map of the name of attributes their provided values. Example: {"age": 30}

subjectCategoricalAttributesarrayDefault: []

Metadata about non-numeric attributes of the subject. Map of the name of attributes their provided values. Example: {"loyalty_tier": "gold"}

actionstringDefault: undefined

The action assigned by the bandit. Example: "promo-20%-off"

actionNumericAttributesarrayDefault: []

Metadata about numeric attributes of the assigned action. Example: {"brandAffinity": 0.2}

actionCategoricalAttributesarrayDefault: []

Metadata about non-numeric attributes of the assigned action. Example: {"previouslyPurchased": false}

actionProbabilityfloatDefault: undefined

The weight between 0 and 1 the bandit valued the assigned action. Example: 0.25

optimalityGapfloatDefault: undefined

The difference between the score of the selected action and the highest-scored action. Example: 456

modelVersionstringDefault: undefined

Unique identifier for the version (iteration) of the bandit parameters used to determine the action probability. Example: "v123"

Querying the bandit for actions

To query the bandit for an action, you can use the getBanditAction() function.


$flagKey = 'my-bandit-flag';
$subject = 'user-123';
$subjectContext = ['accountAge' => 0.5, 'country' => 'US'];

// A simple list of actions with no context attributes
$actions = ['nike', 'adidas', 'reebok'];

$result = $client->getBanditAction($flagKey, $subject, $subjectContext, $actions, 'control');

Example 1.2: Actions with un-grouped context

$actions = [
'nike': [
'brandLoyalty' => 0.0,
'size' => 5,
'colour' => 'red'
], ...
];

$result = $client->getBanditAction($flagKey, $subject, $subjectContext, $actions, 'control');

Subject context

The subject context contains contextual information about the subject that is independent of bandit actions. For example, the subject's age or country.

$subjectContext = new AttributeSet(
numericAttributes: ['accountAge' => 0.5],
categoricalAttributes: ['zip' => 90210, 'country' => 'US']
);

$actions = [
'nike': new AttributeSet(
numericAttributes: ['brandLoyalty' => 0.0],
categoricalAttributes: ['size' => 5, 'colour' => 'red']
), ...
];

$result = $client->getBanditAction($flagKey, $subject, $subjectContext, $actions, 'control');

Action contexts

The action context contains contextual information about each action. They can be provided as a mapping of attribute names to their contexts.

$flagKey = 'my-bandit-flag';
$subject = 'user-123';
$subjectContext = ['accountAge' => 0.5, 'country' => 'US'];

// Actions with structured context
$actions = [
'nike': new AttributeSet(
numericAttributes: ['brandLoyalty' => 0.0],
categoricalAttributes: ['size' => 5, 'colour' => 'red']
), ...
];

$result = $client->getBanditAction($flagKey, $subject, $subjectContext, $actions, 'control');

Result

getBanditAction() returns two fields:

  • variation (string): The variation that was assigned to the subject
  • action (string | null): The action that was assigned to the subject by the bandit, or null if the bandit was not assigned
$result = $client->getBanditAction($flagKey, $subject, $subjectContext, $actions, 'control');
if ($result->action) {
doBanditAction($result->action)
} else {
doTheStatusQuo($result->variation);
}

Status quo algorithm

In order to accurately measure the performance of the bandit, we need to compare it to the status quo algorithm using an experiment. This status quo algorithm could be a complicated algorithm to that selects an action according to a different model, or a simple baseline such as selecting a fixed or random action. When you create an analysis allocation for the bandit and the returned action is null, implement the desired status quo algorithm based on the variation value.

$result = $client->getBanditAction($flagKey, $subject, $subjectContext, $actions, 'control');
if ($result->action) {
doBanditAction($result->action)
} else {
doTheStatusQuo($result->variation);
}

Debugging

You may encounter a situation where a flag assignment produces a value that you did not expect. There are functions detailed here to help you understand how flags are assigned, which will allow you to take corrective action on potential configuration issues.

[Code Placeholder: Example of using debugging functions with bandits in PHP]