> Linked Audiences (with Data Graph, Linked Events) is in public beta, and Segment is actively working on this feature. Some functionality may change before it becomes generally available.

On this page, you'll learn how to connect your Databricks data warehouse to Segment for the [Data Graph](/docs/unify/data-graph/data-graph/).

## Databricks credentials
Segment assumes that you already have a workspace that includes the datasets you'd like to use for the Data Graph. Sign in to Databricks with admin permissions to create new resources and provide the Data Graph with the necessary permissions.

## Step 1: Create a new Service Principal user
Segment recommends setting up a new Service Principal user and only giving this user permissions to access the required catalogs and schemas.

If you already have a Service Principal user you'd like to use, grant it "Can use" permissions for your data warehouse and proceed to [Step 2](#step-2-create-a-catalog-for-segment-to-store-checkpoint-tables).

### 1a) Create a new Service Principal user
1. Log in to the Databricks UI as an Admin.
2. Click **User Management**.
3. Select the **Service principals** tab.
9. Select the “Permissions” tab and click **Add Permissions**.
10. Add the newly created Service Principal user and click **Save**.

### 1b) Add your Service Principal user to Warehouse User Lists
1. Log in to the Databricks UI as an Admin.
2. Navigate to SQL Warehouses.
3. Select your warehouse and click **Permissions**.
4. Add the Service Principal user and grant them “Can use” access.
5. Click **Add**.

## Step 2: Create a catalog for Segment to store checkpoint tables
**Segment requires write access to this catalog for internal bookkeeping and to store checkpoint tables for the queries that are executed. Therefore, Segment recommends creating a new catalog for this purpose.** This is also the catalog you'll be required to specify when connecting Databricks with the Segment app.

> info ""
> Segment recommends creating a new catalog for the Data Graph.
> If you choose to use an existing catalog that has also been used for [Segment Reverse ETL](/docs/connections/reverse-etl/), you must follow the [additional instructions](#update-user-access-for-segment-reverse-etl-catalog) to update user access for the Segment Reverse ETL catalog.

```SQL
CREATE CATALOG IF NOT EXISTS `SEGMENT_LINKED_PROFILES_DB`;
-- Use the Client ID you saved when you generated the secret for the Service Principal user
GRANT USAGE ON CATALOG `SEGMENT_LINKED_PROFILES_DB` TO `${client_id}`;
GRANT CREATE ON CATALOG `SEGMENT_LINKED_PROFILES_DB` TO `${client_id}`;
GRANT SELECT ON CATALOG `SEGMENT_LINKED_PROFILES_DB` TO `${client_id}`;
```
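To double-check that these grants took effect before moving on, you can optionally run a `SHOW GRANTS` statement as a quick sanity check. This isn't required by Segment; `${client_id}` is the same Service Principal client ID used above.

```SQL
-- Optional sanity check: list the privileges the Service Principal user holds on the checkpoint catalog
SHOW GRANTS `${client_id}` ON CATALOG `SEGMENT_LINKED_PROFILES_DB`;
```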
## Step 3: Grant read-only access to the Profiles Sync catalog

Run the following SQL to grant the Data Graph read-only access to the Profiles Sync catalog:

```SQL
GRANT USAGE, SELECT, USE SCHEMA ON CATALOG `${profiles_sync_catalog}` TO `${client_id}`;
```
## Step 4: Grant read-only access to additional catalogs for the Data Graph

Run the following SQL to grant your Service Principal user read-only access to any additional catalogs you want to use for the Data Graph:

```SQL
--********** REPEAT THIS COMMAND FOR EACH CATALOG YOU WANT TO USE FOR THE DATA GRAPH **********
GRANT USAGE, SELECT, USE SCHEMA ON CATALOG `${catalog}` TO `${client_id}`;
```
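For example, if you want the Data Graph to read from two additional catalogs named `prod_analytics` and `marketing` (hypothetical names used only for illustration), the repeated grant would look like this:

```SQL
-- Hypothetical catalog names; run one GRANT per catalog the Data Graph should read
GRANT USAGE, SELECT, USE SCHEMA ON CATALOG `prod_analytics` TO `${client_id}`;
GRANT USAGE, SELECT, USE SCHEMA ON CATALOG `marketing` TO `${client_id}`;
```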
## (Optional) Step 5: Restrict read-only access

### Restrict read-only access to schemas

Restrict access to specific schemas by running the following SQL:

```SQL
GRANT USAGE ON CATALOG `${catalog}` TO `${client_id}`;
USE CATALOG `${catalog}`;
GRANT USAGE, SELECT ON SCHEMA `${schema_1}` TO `${client_id}`;
GRANT USAGE, SELECT ON SCHEMA `${schema_2}` TO `${client_id}`;
...
```
### Restrict read-only access to tables

Restrict access to specific tables by running the following SQL:

```SQL
GRANT USAGE ON CATALOG `${catalog}` TO `${client_id}`;
USE CATALOG `${catalog}`;
GRANT USAGE ON SCHEMA `${schema_1}` TO `${client_id}`;
GRANT SELECT ON TABLE `${table_1}` TO `${client_id}`;
GRANT SELECT ON TABLE `${table_2}` TO `${client_id}`;
```
## Step 6: Validate the permissions of your Service Principal user

Sign in to the [Databricks CLI with your Client ID secret](https://docs.databricks.com/en/dev-tools/cli/authentication.html#oauth-machine-to-machine-m2m-authentication){:target="_blank"} and run the following SQL to verify the Service Principal user has the correct permissions for a given table.

> success ""
> If this command succeeds, you can view the table.

```SQL
USE DATABASE ${linked_read_only_database};
SHOW SCHEMAS;
SELECT * FROM ${schema}.${table} LIMIT 10;
```
## Step 7: Connect your warehouse to Segment

To connect your warehouse to the Data Graph:

1. Navigate to **Unify > Data Graph**. This should be a Unify space with Profiles Sync already set up.
2. Click **Connect warehouse**.
3. Select Databricks as your warehouse type.
4. Enter your warehouse credentials. You can find these details in your Databricks workspace by navigating to **SQL Warehouse > Connection details**. Segment requires the following settings to connect to your Databricks warehouse:
    - **Hostname**: The address of your Databricks server
    - **Http Path**: The address of your Databricks compute resources
    - **Port**: The port used to connect to your Databricks warehouse. The default port is 443, but your port might be different.
    - **Catalog**: The catalog you designated in [Step 2](#step-2-create-a-catalog-for-segment-to-store-checkpoint-tables)
    - **Service principal client ID**: The client ID used to access your Databricks warehouse
    - **OAuth secret**: The OAuth secret used to connect to your Databricks warehouse
5. Test your connection, then click **Save**.

## Update user access for Segment Reverse ETL catalog
If Segment Reverse ETL has ever run in the catalog you are configuring as the Segment connection catalog, a Segment-managed schema is already created and you need to provide the new Segment user access to the existing schema. Run the following SQL if you run into an error on the Segment app indicating that the user doesn’t have sufficient privileges on an existing `_segment_reverse_etl` schema.

```SQL
GRANT ALL PRIVILEGES ON SCHEMA ${segment_internal_catalog}.__segment_reverse_etl TO `${client_id}`;
```
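After running the grant, you can optionally confirm that the Service Principal user now has privileges on the existing schema. This is a sketch that reuses the placeholders from the statement above:

```SQL
-- Optional: verify the grants on the Reverse ETL schema
SHOW GRANTS `${client_id}` ON SCHEMA ${segment_internal_catalog}.__segment_reverse_etl;
```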