
Delivery Overview: Storage [DOC-880] #6919

Merged: 5 commits, Aug 12, 2024
4 changes: 4 additions & 0 deletions src/_includes/content/storage-do-include.md
@@ -0,0 +1,4 @@
{% capture title %}{{page.title}}{% endcapture %}
{% capture name %}{{page.title | replace: 'Destination', ''}}{% endcapture %}

<div class="premonition info"><div class="fa fa-info-circle"></div><div class="content"><p class="header">View observability metrics about your {{title}} with Delivery Overview</p><p markdown=1>Delivery Overview, Segment's built-in observability tool, is now in public beta for storage destinations. For more information, see the [Delivery Overview](/docs/connections/delivery-overview/) documentation.</p></div></div>
54 changes: 40 additions & 14 deletions src/connections/delivery-overview.md
@@ -4,12 +4,13 @@ title: Delivery Overview

Delivery Overview is a visual observability tool designed to help Segment users diagnose event delivery issues for any cloud-streaming destination receiving events from cloud-streaming sources.

> info "Delivery Overview for RETL destinations, Storage destinations, and Engage Audience Syncs currently in development"
> This means that Segment is actively developing Delivery Overview features for RETL destinations, Storage destinations, and Engage Audience syncs. Some functionality may change before Delivery Overview for these integrations becomes generally available.
> info "Delivery Overview for RETL destinations and Engage Audience Syncs currently in development"
> This means that Segment is actively developing Delivery Overview features for RETL destinations and Engage Audience syncs. Some functionality may change before Delivery Overview for these integrations becomes generally available.
>
> Delivery Overview is generally available for streaming connections (cloud-streaming sources and cloud-streaming destinations).
> Delivery Overview is generally available for streaming connections (cloud-streaming sources and cloud-streaming destinations) and in public beta for storage destinations. Some metrics specific to storage destinations, like selective syncs, failed row counts, and total rows seen, are not yet available.
> All users of Delivery Overview have access to the Event Delivery tab, and can configure delivery alerts for their destinations.


## Key features

Delivery Overview has three core features:
@@ -20,25 +21,50 @@ Delivery Overview has three core features:
You can refine these tables using the time picker and the metric toggle, located under the destination header. With the time picker, you can specify a time period (last 10 minutes, 1 hour, 24 hours, 7 days, 2 weeks, or a custom date range over the last two weeks) for which you'd like to see data. With the metric toggle, you can switch between seeing metrics represented as percentages (for example, *85% of events* or *a 133% increase in events*) or as counts (*13 events* or *an increase of 145 events*). Delivery Overview shows percentages by default.

### Pipeline view
The pipeline view provides insights into each step your data is processed by enroute to the destination, with an emphasis on the steps where data can be discarded due to errors or your filter preferences. Each step provides details into counts, change rates, and event details (like the associated Event Type or Event Names), and the discard steps (Failed on ingest, Filtered at source, Filtered at destination, & Failed delivery) provide you with the reasons events were dropped before reaching the destination. Discard steps also include how to control or alter that outcome, when possible. The pipeline view also shows a label between the Filtered at destination and Failed delivery steps indicating how many events are currently pending retry.

The pipeline view shows the following steps:
The pipeline view provides insights into each step your data passes through en route to the destination, with an emphasis on the steps where data can be discarded due to errors or your filter preferences. Each step provides details about counts, change rates, and event details (like the associated Event Type or Event Names), and the discard steps (Failed on ingest, Filtered at source, Filtered at destination, and Failed delivery) give you the reasons events were dropped before reaching the destination. Discard steps also explain how to control or alter that outcome, when possible. The pipeline view also includes a label between the Filtered at destination and Failed delivery steps indicating how many events are currently pending retry.

- **Successfully received**: Events that Segment ingested from your source
- **Failed on ingest**: Events that Segment received, but were dropped due to internal data validation rules
- **Filtered at source**: Events that were discarded due to schema settings or [Protocols](/docs/protocols/) Tracking Plans
#### Classic destinations
The pipeline view for classic destinations includes the following steps:
- **Successfully received**: Events that Segment ingested from your source.
- **Failed on ingest**: Events that Segment received, but were dropped due to internal data validation rules.
- **Filtered at source**: Events that were discarded due to schema settings or [Protocols](/docs/protocols/) Tracking Plans.
- **Filtered at destination**: Events that were discarded due to [Destination Filters](/docs/guides/filtering-data/#destination-filters), [filtering in the Integrations object](/docs/guides/filtering-data/#filtering-with-the-integrations-object), [Destination Insert functions](/docs/connections/functions/insert-functions/), or [per source schema integration filters](/docs/guides/filtering-data/#per-source-schema-integrations-filters). [Actions destinations](/docs/connections/destinations/actions/) also have a filtering capability: for example, if your Action is set to only send Identify events, all other event types will be filtered out. Actions destinations with incomplete triggers or disabled mappings are filtered out at this step. [Consent Management](/docs/privacy/consent-management/) users also see events discarded due to consent preferences. A minimal example of one way events land in this step follows this list.
- **Failed delivery**: Events that have been discarded due to errors or unmet destination requirements
- **Successful delivery**: Events that were successfully delivered to the destination
- **Failed delivery**: Events that have been discarded due to errors or unmet destination requirements.
- **Successful delivery**: Events that were successfully delivered to the destination.
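
As a hypothetical illustration of the Filtered at destination step, the sketch below suppresses a single destination with the integrations object. The write key, user ID, and destination name ("Amplitude") are placeholders, and the snippet assumes the `@segment/analytics-node` library.

```ts
import { Analytics } from "@segment/analytics-node";

// Placeholder write key; substitute one from your own source.
const analytics = new Analytics({ writeKey: "<WRITE_KEY>" });

// Setting a destination to `false` in the integrations object asks Segment not
// to forward this event to it. In Delivery Overview, the event would then be
// counted under Filtered at destination instead of Successful delivery.
analytics.track({
  userId: "user_123",
  event: "Order Completed",
  properties: { revenue: 99.99 },
  integrations: { All: true, Amplitude: false },
});

await analytics.closeAndFlush(); // flush pending events before the process exits
```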

#### Actions destinations
The pipeline view for Actions destinations includes the following steps:
- **Successfully received**: Events that Segment ingested from your source.
- **Failed on ingest**: Events that Segment received, but were dropped due to internal data validation rules.
- **Filtered at source**: Events that were discarded due to schema settings or [Protocols](/docs/protocols/) Tracking Plans.
- **Mapping dropdown**: Select a [mapping](/docs/connections/destinations/actions/#customize-mappings) to filter the events in the Filtered at destination, Failed delivery and Successful delivery pipeline steps.
- **Filtered at destination**: Events that were discarded due to [Destination Filters](/docs/guides/filtering-data/#destination-filters), [filtering in the Integrations object](/docs/guides/filtering-data/#filtering-with-the-integrations-object), [Destination Insert functions](/docs/connections/functions/insert-functions/), or [per source schema integration filters](/docs/guides/filtering-data/#per-source-schema-integrations-filters). [Actions destinations](/docs/connections/destinations/actions/) also have a filtering capability: for example, if your Action is set to only send Identify events, all other event types will be filtered out. Actions destinations with incomplete triggers or disabled mappings are filtered out at this step. [Consent Management](/docs/privacy/consent-management/) users also see events discarded due to consent preferences.
- **Retry count**: The number of events currently pending retry.
- **Failed delivery**: Events that have been discarded due to errors or unmet destination requirements.
- **Successful delivery**: Events that were successfully delivered to the destination.

Actions destinations also include a mapping dropdown, which allows you to select a [mapping](/docs/connections/destinations/actions/#customize-mappings) to filter the events in the Filtered at destination, Failed delivery and Successful delivery pipeline steps. The following image shows an Actions destination filtered to include only Track Page View events in the last three pipeline steps:
The following image shows an Actions destination filtered to include only Track Page View events in the last three pipeline steps:

![A screenshot of the Delivery Overview tab for an Actions destination, with the Track Page View mapping selected.](images/delivery-overview-actions-destination.jpeg)

#### Storage destinations
The pipeline view for storage destinations includes the following steps:
- **Successfully received**: Events that Segment ingested from your source.
- **Failed on ingest**: Events that Segment received, but were dropped due to internal data validation rules.
- **Filtered at source**: Events that were discarded due to schema settings or [Protocols](/docs/protocols/) Tracking Plans.
- **Filtered at destination**: Events that were discarded due to [Destination Filters](/docs/guides/filtering-data/#destination-filters), [filtering in the Integrations object](/docs/guides/filtering-data/#filtering-with-the-integrations-object), [Destination Insert functions](/docs/connections/functions/insert-functions/), or [per source schema integration filters](/docs/guides/filtering-data/#per-source-schema-integrations-filters). [Actions destinations](/docs/connections/destinations/actions/) also have a filtering capability: for example, if your Action is set to only send Identify events, all other event types will be filtered out. Actions destinations with incomplete triggers or disabled mappings are filtered out at this step. [Consent Management](/docs/privacy/consent-management/) users also see events discarded due to consent preferences.
- **Events to warehouse rows**: A read-only box that shows the point in the delivery process where Segment converts events into warehouse rows.
- **Failed to sync**: Syncs that either failed outright or were only partially successful. Selecting this step takes you to a table of all syncs with one or more failed collections. Select a sync from the table to view the discard reason, any collections that failed, the status, and the number of rows that synced for each collection. For information about common errors, see the [Warehouse Errors](/docs/connections/storage/warehouses/warehouse-errors/) documentation.
- **Successfully synced**: A record of all successful or partially successful syncs made with your destination. To view the reason a partially successful sync was not fully successful, see the Failed to sync step.

The following image shows a storage destination with 23 partially successful syncs:

![A screenshot of the Delivery Overview tab for a Storage destination, with the Failed to sync step selected and a table of partially successful syncs.](images/delivery-overview-storage-destinations.png)

### Breakdown table
The breakdown table provides you with greater detail about the selected events.

To open the breakdown table, select either the first step in the pipeline view (Successfully received,) the last step in the pipeline view (Successful delivery,) or select a discard step and then click on a discard reason.
To open the breakdown table, select either the first step in the pipeline view, the last step in the pipeline view, or select a discard step and then click on a discard reason.

The breakdown table displays the following details:
- **Event type**: The Segment spec event type (Track call vs. Identify call, for example)
@@ -96,7 +122,7 @@ You can use the Event Delivery alerting features (Delivery Alerts) by selecting

Note that this depends on your [notification settings](/docs/segment-app/#segment-settings). For example, if the threshold is set to 99%, you'll be notified each time fewer than 99% of events are successfully delivered.

You can also use Connections Alerting, a feature that allows Segment users to receive in-app, email, and Slack notifications related to the performance and throughput of an event-streaming connection.
You can also use [Connections Alerting](/docs/connections/alerting), a feature that allows Segment users to receive in-app, email, and Slack notifications related to the performance and throughput of an event-streaming connection.

Connections Alerting allows you to create two different alerts:
- **Source volume alerts**: These alerts notify you if your source ingests an abnormally small or large amount of data. For example, if you set a change percentage of 4%, you would be notified when your source ingests less than 96% or more than 104% of the typical event volume.
4 changes: 3 additions & 1 deletion src/connections/storage/catalog/aws-s3/index.md
@@ -11,14 +11,16 @@ The AWS S3 destination provides a more secure method of connecting to your S3 bu

Functionally, the two destinations (Amazon S3 and AWS S3 with IAM Role Support) copy data in a similar manner.

## Getting Started
## Getting started

The AWS S3 destination puts the raw logs of the data Segment receives into your S3 bucket, encrypted, no matter what region the bucket is in.

AWS S3 works differently than most destinations. Using a destination selector like the [integrations object](/docs/connections/spec/common/#integrations) has no effect on events sent to AWS S3.

The Segment Tracking API processes data from your sources and collects the events in batches. Segment then uploads the batches to a secure Segment S3 bucket, from which they're securely copied to your own S3 bucket in small bursts. Individual files won't exceed 100 MB in size.
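
If you want to inspect those raw logs yourself after they land in your bucket, here's a minimal sketch of one way to read a downloaded file locally. It assumes the files are gzip-compressed, newline-delimited JSON with one raw API call payload per line; check a file from your own bucket to confirm the format.

```ts
import { createReadStream } from "node:fs";
import { createGunzip } from "node:zlib";
import { createInterface } from "node:readline";

// Count the events in one log file you've already downloaded from your bucket.
async function countEvents(path: string): Promise<number> {
  const lines = createInterface({
    input: createReadStream(path).pipe(createGunzip()),
  });
  let count = 0;
  for await (const line of lines) {
    if (!line.trim()) continue;
    const event = JSON.parse(line); // assumed: one raw payload per line
    if (typeof event.type === "string") count++; // "track", "identify", "page", ...
  }
  return count;
}
```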

{% include content/storage-do-include.md %}

{% comment %}

![Diagram showing how data is transferred from Segment Tracking API to a customer's AWS S3 bucket.](images/s3processdiagram.png)
2 changes: 2 additions & 0 deletions src/connections/storage/catalog/azuresqldw/index.md
@@ -9,6 +9,8 @@ redirect_from:

Azure's [Azure Synapse Analytics](https://azure.microsoft.com/en-us/services/synapse-analytics/){:target="_blank"}, previously known as Azure SQL Data Warehouse, is a limitless analytics service that brings together enterprise data warehousing and Big Data analytics.

{% include content/storage-do-include.md %}

## Getting Started

Complete the following prerequisites in Microsoft Azure before connecting your Azure Synapse Analytics databases to Segment:
2 changes: 2 additions & 0 deletions src/connections/storage/catalog/bigquery/index.md
@@ -13,6 +13,8 @@ Google AdWords into a BigQuery data warehouse. When you integrate BigQuery with
The Segment warehouse connector runs a periodic ETL (Extract - Transform - Load) process to pull raw events and objects from your sources and load them into your BigQuery cluster.
For more information about the ETL process, including how it works and common ETL use cases, refer to [Google Cloud's ETL documentation](https://cloud.google.com/learn/what-is-etl){:target="_blank"}.

{% include content/storage-do-include.md %}

## Getting Started

To store your Segment data in BigQuery, complete the following steps:
1 change: 1 addition & 0 deletions src/connections/storage/catalog/databricks/index.md
@@ -90,3 +90,4 @@ Segment uses the service principal to access your Databricks workspace and assoc

Once connected, you'll see a confirmation screen with next steps and more info on using your warehouse.

{% include content/storage-do-include.md %}
4 changes: 3 additions & 1 deletion src/connections/storage/catalog/db2/index.md
@@ -11,7 +11,7 @@ all of your event and Cloud Source data in a warehouse built by IBM. This
guide will walk through what you need to know to get up and running with Db2
Warehouse and Segment.

> note " "
> info " "
> This document refers specifically to [IBM Db2 Warehouse on Cloud](https://www.ibm.com/cloud/db2-warehouse-on-cloud){:target="_blank"}, [IBM Db2 Warehouse](https://www.ibm.com/analytics/db2){:target="_blank"}, and the [IBM Integrated Analytics System](https://www.ibm.com/products/integrated-analytics-system){:target="_blank"}. For questions related to any of these products, see the [IBM Cloud Docs](https://cloud.ibm.com/docs){:target="_blank"}.

## Getting Started
@@ -21,6 +21,8 @@ To get started, you'll need to:
2. [Grant the user sufficient permissions](#grant-the-segment-user-permissions).
3. [Create the IBM Db2 Destination in the Segment app](#create-segment-db2-destination).

{% include content/storage-do-include.md %}

### Create a User for Segment

In order to connect your IBM Db2 warehouse to Segment, you need to create a Db2 user account that Segment can assume. To create a user account for Segment:
src/connections/storage/catalog/google-cloud-storage/index.md
@@ -4,8 +4,6 @@ integration-type: destination
redirect_from: '/connections/destinations/catalog/google-cloud-storage/'
---



The Google Cloud Storage (GCS) destination puts the raw logs of the data Segment receives into your GCS bucket. The data is copied into your bucket at least every hour. You might see multiple files over a period of time depending on how much data is copied.

> warning ""
@@ -20,7 +18,6 @@ The Google Cloud Storage (GCS) destination puts the raw logs of the data Segment
1. Create a Service Account to allow Segment to copy files into the bucket
2. Create a bucket in your preferred region.


## Set up Service Account to give Segment access to upload to your Bucket

1. Go to http://cloud.google.com/iam
4 changes: 3 additions & 1 deletion src/connections/storage/catalog/postgres/index.md
@@ -11,14 +11,16 @@ PostgreSQL, or Postgres, is an object-relational database management system (ORD

PostgreSQL is ACID-compliant and transactional. It supports updatable views, materialized views, triggers, foreign keys, functions, stored procedures, and other extensibility features. PostgreSQL is developed by the PostgreSQL Global Development Group and is free and open source.

> note "Segment sources required"
> info "Segment sources required"
> In order to add a Postgres destination to Segment, you must first add a source. To learn more about sources in Segment, check out the [Sources Overview](/docs/connections/sources) documentation.

## Getting started
Segment supports the following Postgres database providers:
- [Heroku](#heroku-postgres)
- [RDS](#rds-postgres)

{% include content/storage-do-include.md %}

Segment supported a third Postgres provider, Compose, until Compose [was deprecated on March 1, 2023](https://help.compose.com/docs/compose-deprecation){:target="_blank"}. To continue sending your Segment data to a Postgres destination, consider using either [Heroku Postgres](#heroku-postgres) or [Amazon's Relational Database Service](#rds-postgres).

> warning ""
2 changes: 2 additions & 0 deletions src/connections/storage/catalog/redshift/index.md
@@ -17,6 +17,8 @@ Complete the following steps to provision your Redshift cluster, and connect Seg
3. [Create a database user](#create-a-database-user)
4. [Connect Redshift to Segment](#connect-redshift-to-segment)

{% include content/storage-do-include.md %}

## Choose the best instance for your needs

While the number of events (database records) is important, the storage capacity usage of your cluster depends primarily on the number of unique tables and columns created in the cluster. Keep in mind that each unique `.track()` event creates a new table, and each property sent creates a new column in that table. To avoid storing unnecessary data, start with a detailed [tracking plan](/docs/protocols/tracking-plan/create/) before you install Segment libraries to ensure that only the necessary events are passed to Segment.
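
As a rough, hypothetical sketch of that mapping (table and column names assume Segment's usual snake_casing; confirm against the schema in your own cluster):

```ts
import { Analytics } from "@segment/analytics-node";

const analytics = new Analytics({ writeKey: "<WRITE_KEY>" }); // placeholder key

// One event name maps to one table; each property becomes a column in it.
analytics.track({
  userId: "user_123",
  event: "Order Completed",   // -> table <source_schema>.order_completed
  properties: {
    orderId: "50314b8e",      // -> column order_id
    revenue: 99.99,           // -> column revenue
    couponCode: "SPRING10",   // sending a new property later adds a new column
  },
});
```
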
2 changes: 2 additions & 0 deletions src/connections/storage/catalog/snowflake/index.md
@@ -23,6 +23,8 @@ There are six steps to get started using Snowflake with Segment.
5. [Test the user and credentials](#step-5-test-the-user-and-credentials)
6. [Connect Snowflake to Segment](#step-6-connect-snowflake-to-segment)

{% include content/storage-do-include.md %}

### Prerequisites

To set up the virtual warehouse, database, role, and user in Snowflake for Segment's Snowflake destination, you must have the `ACCOUNTADMIN` role or a custom role with the following [Snowflake privileges](https://docs.snowflake.com/en/user-guide/security-access-control-overview#label-access-control-overview-privileges){:target="_blank"}: