Client count

Anything that connects and authenticates to Vault to accomplish a task is a client. For example, a user logging into a cluster to manage policies or a machine-based system (application or cloud service) requesting a database token are both considered clients.

While there are many different potential clients, the most common are:

  1. Human users interacting directly with Vault.

  2. Applications and microservices.

  3. Servers and platforms like VMs, Docker containers, or Kubernetes pods.

  4. Orchestrators like Nomad, Terraform, Ansible, ACME, and other continuous integration / continuous delivery (CI/CD) pipelines.

  5. Vault agents and proxies that act on behalf of an application or microservice.

Identity and entity assignment

Authorized clients can connect to Vault with a variety of authentication methods.

Authorization source         AuthN method
Externally managed or SSO    Active Directory, LDAP, OIDC, JWT, GitHub, username+password
Platform- or server-based    Kubernetes, AWS, GCP, Azure, Cert, Cloud Foundry
Self                         AppRole, tokens with no associated authN path or role

When a client authenticates, Vault assigns a unique identifier (client entity) in the Vault identity system based on the authentication method used or a previously assigned alias. Entity aliases let clients authenticate with multiple methods but still be associated with a single policy, share resources, and count as the same entity, regardless of the authentication method used for a particular session.

Standard entity assignments

Each authentication method has a unique ID string that corresponds to a client entity used for telemetry. For example, a microservice authenticating with AppRole takes the associated role ID as the entity. If you are running at scale and have multiple copies of the microservice using the same role ID, the full set of instances shares the same identifier.

As a result, it is critical that you configure different clients (microservices, humans, applications, services, platforms, servers, or pipelines) in a way that results in distinct clients having unique identifiers. For example, the role IDs should be different between two microservices, MicroServiceA and MicroServiceB, even if all instances of MicroServiceA share one role ID and all instances of MicroServiceB share another.
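The effect of shared role IDs on the count can be illustrated with a minimal sketch. The record layout and the names `count_approle_clients`, `instance`, and `role_id` are hypothetical and exist only for illustration; they are not part of any Vault API.

```python
def count_approle_clients(logins):
    """Count one client per distinct role ID, as the text describes for AppRole."""
    return len({login["role_id"] for login in logins})


# Hypothetical login records: two instances of one microservice share a role ID.
logins = [
    {"instance": "microservice-a-1", "role_id": "role-a"},
    {"instance": "microservice-a-2", "role_id": "role-a"},
    {"instance": "microservice-b-1", "role_id": "role-b"},
]

print(count_approle_clients(logins))  # 2
```

Three running instances produce two clients here because the two MicroServiceA instances share a role ID while MicroServiceB has its own.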

Entity assignment with ACME

Vault treats all ACME connections that authenticate under the same certificate identifier (domain) as the same certificate entity for client count calculations.

For example:

  • ACME client requests (from the same server or separate servers) for the same certificate identifier (a unique combination of CN, DNS SANs, and IP SANs) are treated as the same entity.

  • If an ACME client makes a request for a.test.com, and subsequently makes a new request for b.test.com and *.test.com, then two distinct entities are created: one for a.test.com and another for the combination of b.test.com and *.test.com.

  • Overlapping certificate identifiers from different ACME clients are treated as the same entity. For example, if client 1 requests a.test.com and client 2 requests a.test.com, a single entity is created for both requests.
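The examples above can be sketched by treating each request's set of identifiers as the entity key. This is an illustrative model only: the function names are hypothetical, and using an unordered set of names as the key is an assumption about how the "unique combination" is compared.

```python
def certificate_entity_key(identifiers):
    """Assumed entity key: the unordered combination of requested identifiers."""
    return frozenset(identifiers)


def count_certificate_entities(requests):
    """Count distinct certificate entities across all ACME requests."""
    return len({certificate_entity_key(r) for r in requests})


requests = [
    ["a.test.com"],                # client 1
    ["b.test.com", "*.test.com"],  # client 1, later request for a new combination
    ["a.test.com"],                # client 2, same identifier as client 1
]

print(count_certificate_entities(requests))  # 2
```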

Secret sync clients

Vault can automatically update secrets in external destinations with secret sync. A secret that gets synced to one or more destinations is considered a secret sync client for client count calculations.

Note that:

  • Each synced secret is counted distinctly based on the path and namespace of the secret. If you have secrets at path kv1/secret and kv2/secret, and both are synced, then two distinct secret syncs are counted.

  • A secret can be synced to multiple different destinations and still only count as one secret sync. If kv/secret is synced to both Azure Key Vault and AWS Secrets Manager, it counts as only one secret sync client.

  • Secret sync clients are only created after you create an association between a secret and a store. If you create kv/secret and do not associate this secret with any destinations, it is not counted as a secret sync client.

  • Secret sync clients are registered in Vault's client counting system so long as the sync is active. If you create kv/secret and associate it with a destination in January, update the secret in May, and then delete the secret in September, Vault considers this client as having been seen throughout the entire period of January through September.
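As a rough sketch of the first three rules above, secret sync clients can be modeled as the distinct (namespace, path) pairs that have at least one association. The tuple layout, the destination labels, and the function name are hypothetical.

```python
def count_secret_sync_clients(associations):
    """Count distinct (namespace, path) pairs that have at least one destination.

    `associations` is an iterable of (namespace, secret_path, destination) tuples.
    Secrets with no association never appear in it, so they are never counted.
    """
    return len({(namespace, path) for namespace, path, _destination in associations})


associations = [
    ("root", "kv1/secret", "aws-secrets-manager"),
    ("root", "kv2/secret", "aws-secrets-manager"),
    ("root", "kv/secret", "azure-key-vault"),
    ("root", "kv/secret", "aws-secrets-manager"),  # second destination, same secret
]

print(count_secret_sync_clients(associations))  # 3
```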

Entity assignment with namespaces

A namespace represents an isolated, logical space within a single Vault cluster and is typically used for administrative purposes.

When a client authenticates within a given namespace, Vault assigns the same client entity to activities within any child namespaces because the namespaces exist within the same larger scope.

When a client authenticates across namespace boundaries, Vault treats the single client as two distinct entities because the client is operating across different scopes with different policy assignments and resources.

For example:

  • Different requests under parent and child namespaces from a single client authenticated under the parent namespace are assigned the same entity ID. All the client activities occur within the boundaries of the namespace referenced in the original authentication request.

  • Different requests under parent and child namespaces from a single client authenticated under the child namespace are assigned different entity IDs. Some of the client activities occur outside the boundaries of the namespace referenced in the original authentication request.

  • Requests by the same client to two different namespaces, NAMESPACEA and NAMESPACEB, are assigned different entity IDs.
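The three cases above reduce to one question: does the request fall inside the namespace used for the original authentication? The sketch below models that check with path-style namespace names. The function name and the path convention are assumptions for illustration, and the root namespace is not modeled.

```python
def shares_authenticated_entity(auth_namespace, request_namespace):
    """Return True if the request stays within the authentication namespace.

    A request stays in scope when it targets the authentication namespace
    itself or one of its child namespaces (path-prefix match on whole segments).
    """
    auth = auth_namespace.strip("/")
    request = request_namespace.strip("/")
    return request == auth or request.startswith(auth + "/")


print(shares_authenticated_entity("parent", "parent/child"))   # True
print(shares_authenticated_entity("parent/child", "parent"))   # False
print(shares_authenticated_entity("NAMESPACEA", "NAMESPACEB"))  # False
```

When the function returns False, the text above implies the activity is attributed to a different entity ID than the one assigned at authentication.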

Entity assignment with non-entity tokens

Vault uses tokens as the core method for authentication. You can use tokens to authenticate directly, or use token auth methods to dynamically generate tokens based on external identities.

When clients authenticate with the token auth method without a client identity, the result is a non-entity token. For example, a service might use the token authentication method to create a token for a user whose explicit identity is unknown.

Ultimately, non-entity tokens trace back to a particular client or purpose, so Vault assigns unique entity IDs to non-entity tokens based on a combination of the:

  • assigned entity alias name (if present),

  • associated policies, and

  • namespace under which the token was created.

In rare cases, tokens may be created outside of the Vault identity system without an associated entity or identity. Vault treats every unaffiliated token as a unique client for production usage. We strongly discourage the use of unaffiliated tokens and recommend that you always associate a token with an entity alias and token role.

All non-entity tokens with the same namespace and policy assignments are treated as the same client entity.
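To illustrate why tokens with matching attributes collapse into one client, the sketch below derives an identifier from the alias name, policies, and namespace. The hashing scheme, the JSON serialization, and the function name are hypothetical and do not reflect how Vault actually constructs the ID.

```python
import hashlib
import json


def constructed_client_id(namespace, policies, alias_name=None):
    """Derive a stable ID from alias name, policy set, and namespace (illustrative only)."""
    key = json.dumps(
        {
            "alias": alias_name or "",
            "namespace": namespace,
            "policies": sorted(set(policies)),  # policy order must not matter
        },
        sort_keys=True,
    )
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


token_a = constructed_client_id("ns1", ["default", "app-read"])
token_b = constructed_client_id("ns1", ["app-read", "default"])
token_c = constructed_client_id("ns2", ["default", "app-read"])

print(token_a == token_b)  # True: same namespace and policies
print(token_a == token_c)  # False: different namespace
```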

Client count calculation

Vault provides usage telemetry for the number of clients based on the number of unique entity assignments within a Vault cluster over a given billing period:

  • Standard entity assignments based on authentication method for active entities.

  • Constructed entity assignments for active non-entity tokens, including batch tokens created by performance standby nodes.

  • Certificate entity assignments for ACME connections.

  • Secrets being synced to at least one sync destination.

CLIENT_COUNT_PER_CLUSTER = UNIQUE_STANDARD_ENTITIES +
                           UNIQUE_CONSTRUCTED_ENTITIES +
                           UNIQUE_CERTIFICATE_ENTITIES +
                           UNIQUE_SYNCED_SECRETS

Vault does not aggregate or de-duplicate clients across clusters, but all logs and precomputed reports are included in DR replication.
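A small worked example of the formula, using made-up identifiers, may help. The function and variable names are hypothetical; the only logic taken from the text is that each category is de-duplicated within a cluster and that clusters are summed without de-duplication.

```python
def client_count_per_cluster(standard, constructed, certificate, synced_secrets):
    """Sum the unique members of each client category for one cluster."""
    return (
        len(set(standard))
        + len(set(constructed))
        + len(set(certificate))
        + len(set(synced_secrets))
    )


cluster_1 = client_count_per_cluster(
    standard=["entity-1", "entity-2", "entity-1"],  # entity-1 seen twice
    constructed=["token-client-1"],
    certificate=["a.test.com"],
    synced_secrets=["ns1:kv/secret", "ns1:kv2/secret"],
)
cluster_2 = client_count_per_cluster(["entity-1"], [], [], [])

print(cluster_1)              # 6
print(cluster_1 + cluster_2)  # 7: entity-1 is counted again on the second cluster
```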

Vault currently rolls certificate entities into the non-entity client count in the UI and API query requests. For more detailed information on certificate entities, use the internal counter endpoint to query monthly data for the PKI mount path or export historic data and look for client_type=pki-acme.

How Vault tracks clients

Each time a client authenticates, Vault checks whether the corresponding entity ID has already been recorded in the client log as active for the current month:

  • If no record exists, Vault adds an entry for the entity ID.

  • If a record exists but the entity was last active prior to the current month, Vault adds a new entry to the client record for the entity ID.

  • If a record exists and the entity was last active within the current month, Vault does not add a new entry to the client record for the entity ID.

For example:

  • Two non-entity tokens under the same namespace, with the same alias name and policy assignment, receive the same entity assignment and are only counted once.

  • Two authentication requests from a single ACME client for the same certificate identifiers from different mounts receive the same entity assignments and are counted once.

  • An application authenticating with AppRole receives the same entity assignment every time and is only counted once.

At the end of each month, Vault pre-computes reports for each cluster on the number of active entities, per namespace, for each time period within the configured retention period. By de-duplicating records from the current month against records for the previous month, Vault ensures entities that remain active within every calendar month are only counted once for the year.
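The record-keeping and month-over-month de-duplication described above can be sketched with a per-month set of entity IDs. The `ClientLog` class, its method names, and the "YYYY-MM" month keys are hypothetical; Vault's actual log format is not shown in this document.

```python
from collections import defaultdict


class ClientLog:
    """Toy client log: one set of active entity IDs per calendar month."""

    def __init__(self):
        self.months = defaultdict(set)  # "YYYY-MM" -> entity IDs active that month

    def record(self, month, entity_id):
        """Add an entry unless the entity is already active this month."""
        if entity_id in self.months[month]:
            return False  # already recorded for the current month
        self.months[month].add(entity_id)
        return True

    def new_clients(self, month):
        """Entities active in `month` that were not active in any earlier month."""
        seen_before = set()
        for other_month, ids in self.months.items():
            if other_month < month:  # "YYYY-MM" strings sort chronologically
                seen_before |= ids
        return self.months[month] - seen_before


log = ClientLog()
print(log.record("2024-01", "entity-1"))  # True: no record existed
print(log.record("2024-01", "entity-1"))  # False: already active this month
print(log.record("2024-02", "entity-1"))  # True: last active in a prior month
print(log.record("2024-02", "entity-2"))  # True: no record existed
print(log.new_clients("2024-02"))         # {'entity-2'}
```

Summing `new_clients` over every month in a period counts each entity once, which is the effect the de-duplication step is described as having.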

The deduplication process has two additional consequences:

  1. Detailed reporting lags by 1 month at the start of the billing period.

  2. Billing period reports that include the current month must use an approximation for the number of new clients in the current month.

How Vault approximates current-month client count

Vault approximates client count for the current month using a HyperLogLog algorithm that looks at the difference between the cardinalities of:

  • the number of clients across the entire billing period, and

  • the number of clients across the billing period excluding clients from the current month.

The approximation algorithm uses the axiomhq library with fourteen registers and sparse representations (when applicable). The multiset for the calculation is the total number of clients within a billing period, and the accuracy estimate for the approximation decreases as the difference between the number of clients in the current month and the number of clients in the billing period increases.
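The difference-of-cardinalities idea can be shown with exact Python sets standing in for HyperLogLog sketches. This is a deliberate simplification: the function name is hypothetical, and with real sketches both cardinalities are estimates, which is where the approximation error comes from.

```python
def estimate_new_clients(previous_months, current_month):
    """New clients = |billing period incl. current month| - |billing period excl. it|.

    Exact sets are used here; an HLL-based version would merge sketches and
    subtract two estimated cardinalities instead.
    """
    with_current = len(set().union(*previous_months, current_month))
    without_current = len(set().union(*previous_months))
    return with_current - without_current


previous_months = [{"a", "b"}, {"b", "c"}]
current_month = {"c", "d", "e"}

print(estimate_new_clients(previous_months, current_month))  # 2 ("d" and "e")
```

Because the result is a small difference between two large estimated values, even a modest relative error in either estimate can dominate when the current month contributes few new clients.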

Testing verification for client count approximations

Given CM as the number of clients for the current month and BP as the number of clients in the billing period, we found that the approximation becomes increasingly imprecise as:

  • the difference between BP and CM increases,

  • the value of CM approaches zero, and

  • the number of months in the billing period increases.

The maximum observed error rate (ER = (FOUND_NEW_CLIENTS / EXPECTED_NEW_CLIENTS)) was 30% for 10,000 clients or fewer, with an error rate of 5% to 10% in the average case.

For the purposes of predictive analysis, the following tables list a random sample of the values we found during testing for CM, BP, and ER.

Current month (CM)    Billing period (BP)    Error rate (ER)
7                     10                     0%
20                    600                    0%
20                    1000                   0%
20                    6000                   10%
20                    10000                  10%
200                   600                    0%
200                   10000                  7%
400                   6000                   5%
2000                  10000                  4%

Current month (CM)    Billing period (BP)    Error rate (ER)
20                    15                     0%
20                    100                    0%
20                    1000                   0%
20                    10000                  30%
200                   10000                  6%
2000                  10000                  2%

Resource costs for client computation

In addition to the storage used for storing the pre-computed reports, each active entity in the client log consumes a few bytes of storage. As a safety measure against runaway storage growth, Vault limits the number of entity records to 656,000 per month, but typical storage costs are much less.

On average, 1000 monthly active entities require 1.5 MiB of storage capacity over the default 24-month retention period.
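For rough capacity planning, the figures above can be turned into a back-of-the-envelope estimate. The sketch assumes storage scales linearly with the number of monthly active entities, which the text does not state; the function and constant names are hypothetical.

```python
MIB_PER_1000_ENTITIES = 1.5             # from the text, over the 24-month default retention
MAX_ENTITY_RECORDS_PER_MONTH = 656_000  # monthly record limit stated in the text


def estimate_storage_mib(monthly_active_entities):
    """Estimate client-log storage in MiB, assuming linear scaling (an assumption)."""
    capped = min(monthly_active_entities, MAX_ENTITY_RECORDS_PER_MONTH)
    return capped / 1000 * MIB_PER_1000_ENTITIES


print(estimate_storage_mib(1_000))   # 1.5
print(estimate_storage_mib(50_000))  # 75.0
```

The estimate excludes the storage used by the pre-computed reports themselves.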
