Add a data stream

A data stream is a logical sub-division of an integration package, dealing with one specific observable aspect of the service or product being monitored. For example, the Apache integration has three data streams, each represented by a separate folder of assets in the data_stream directory:

apache
└───data_stream
    ├───access
    ├───error
    └───status
Note: Data streams allow you to store time series data across multiple indices while giving you a single named resource for requests.

A data stream defines multiple Elasticsearch assets, like index templates, ingest pipelines, and field definitions. These assets are loaded into Elasticsearch when a user installs an integration using the Fleet UI in Kibana.

A data stream also defines a policy template. Policy templates include variables that allow users to configure the data stream using the Fleet UI in Kibana. Then, the Elastic Agent interprets the resulting policy to collect relevant information from the product or service being observed. Policy templates can also define an integration’s supported deployment_modes.
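As a rough illustration, a data stream's manifest.yml might declare its type and a policy template stream with user-configurable variables along these lines. This is a minimal sketch, not a complete file, and the variable name and default path are invented for the example (deployment_modes is typically declared on the policy template in the package-level manifest.yml rather than here):

title: Apache access logs
type: logs
streams:
  - input: logfile
    title: Apache access logs
    description: Collect Apache access logs.
    vars:
      # Hypothetical variable exposed to users in the Fleet UI
      - name: paths
        type: text
        title: Paths
        multi: true
        required: true
        default:
          - /var/log/apache2/access.log*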

See the Elasticsearch data streams documentation for more information.

  1. Bootstrap a new data stream

In your package directory, run:

elastic-package create data-stream

Follow the prompts to set the name, title, and type (logs, metrics, etc.) for the data stream. Repeat this command for each new data stream you want to add.
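The exact output depends on your elastic-package version, but the generated folder for a new data stream typically looks something like this:

apache
└───data_stream
    └───access
        ├───agent
        │   └───stream
        │       └───stream.yml.hbs
        ├───elasticsearch
        │   └───ingest_pipeline
        │       └───default.yml
        ├───fields
        │   └───base-fields.yml
        └───manifest.yml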

  2. Configure the data stream

After bootstrapping, manually adjust the generated files to suit your use case:

  • Define required variables: In the policy template, specify the variables users can configure (for example, paths, ports, or log levels), as in the manifest sketch above.
  • Define used fields: Edit the files under fields/ to describe the structure and types of the data your stream will ingest (see the fields sketch after this list).
  • Define ingest pipeline definitions: If needed, create or update ingest pipelines to parse, enrich, or transform incoming data before it is indexed (see the pipeline sketch after this list).
  • Update the Elastic Agent stream configuration: Ensure the Elastic Agent's stream configuration matches your data collection requirements and references the correct variables and pipelines (see the stream template sketch after this list).
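For the fields definition, each file under fields/ is a YAML list of field declarations. A minimal sketch, with illustrative field names, might look like:

# Hypothetical fields describing Apache access log data
- name: apache.access.remote_ip
  type: ip
  description: Client IP address of the request.
- name: apache.access.response_code
  type: long
  description: HTTP status code returned to the client.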
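An ingest pipeline definition (conventionally elasticsearch/ingest_pipeline/default.yml) is plain Elasticsearch pipeline YAML. The following sketch, with an invented grok pattern, parses the two hypothetical fields above and records any processing failure:

description: Pipeline for parsing Apache access logs.
processors:
  # Extract fields from the raw log line; pattern is illustrative only
  - grok:
      field: message
      patterns:
        - '%{IP:apache.access.remote_ip} %{NUMBER:apache.access.response_code:long}'
on_failure:
  # Capture the failure reason so bad documents remain debuggable
  - set:
      field: error.message
      value: '{{ _ingest.on_failure_message }}'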
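The Elastic Agent stream template (conventionally agent/stream/stream.yml.hbs) is Handlebars-templated YAML that consumes the variables defined in the manifest. A sketch matching the hypothetical paths variable from earlier:

# Expand the user-configured paths variable into the input configuration
paths:
{{#each paths}}
  - {{this}}
{{/each}}
exclude_files: ['.gz$']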
  3. How data streams are used
  • When the integration is installed, each data stream is registered in Elasticsearch as a managed, time-based resource.
  • Data sent to the data stream is automatically routed to the correct backing indices, with lifecycle management (rollover, retention) handled by Elasticsearch.
  • Users can query, visualize, and analyze data from each stream in Kibana using the single data stream name (e.g., logs-apache.access), as sketched below.
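Because the data stream name is a single addressable resource, searching it from Kibana Dev Tools is an ordinary request against that name. A minimal sketch, assuming the Apache access data stream from the earlier example:

GET logs-apache.access/_search
{
  "query": {
    "match_all": {}
  }
}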