This repository contains an experimental template for a remote MLFlow Tracking Server with support for OpenID Connect (OIDC) and User-Managed Access (UMA).
When an MLFlow Tracking server must be exposed to the internet, access must be restricted and permissions controlled across multiple users. The built-in support for authentication is experimental and currently lacks some desirable access-control policies. This repository is therefore an experimental template that uses Keycloak and NGINX to provide enhanced user management and remote access to a self-hosted MLFlow tracking server.
- `Docker` and `docker-compose`;
- Python 3.10+ (local execution);
- The first step is to start all the servers by running `docker-compose up`;
- Once everything is up and running, you should be able to access the following services:
  - MinIO at http://localhost:9001 (user: `root`, password: `rootroot`);
  - Keycloak at http://localhost:8083 (user: `admin`, password: `admin`);
  - NGINX proxy to MLFlow tracking server at http://localhost:8000 (the browser should indicate a `400 Bad Request`);
- The next step is to define a sample user in Keycloak: the user should be on the realm `mlflow` and must have a role mapping for either `Default` or `Admin`;
  - The `mlflow` realm comes pre-configured with roles, permissions and resources; you might want to take a deeper look into the realm configuration;
  - As a quick note, `Admin` users are allowed to call the delete endpoints of MLFlow, while the others are not;
  - When a `Default` user tries to run a delete operation, the server returns a `403 Forbidden`;
- Once you have at least one user defined in the `mlflow` realm on Keycloak, you should be able to access the MLFlow tracking server UI and run experiments;
- Start by installing the requirements with your preferred manager (e.g., `uv pip install -r requirements.txt`);
  - It is advisable to create a virtual environment instead of installing packages system-wide;
- Once everything is installed, access the MLFlow interface by running `python -m mlflow_app.ui`;
  - You will be prompted to enter your username and password defined on Keycloak;
  - The interface will be available at http://localhost:5000;
  - The `mlflow_app.ui` module creates a simple local proxy using `FastAPI`, which takes care of sending your user credentials (i.e., access token) to NGINX;
  - The OIDC client creates a `credentials.json` file which contains your tokens. This information is sensitive and should be stored carefully;
- Now that you can access the MLFlow UI using a secure token, you can run a sample experiment with `python -m mlflow_app.sample`;
  - The code should run quite fast;
  - You should note that the sample uses the `auto_refresh.TokenAutoRefresh` context manager, which takes care of updating the OIDC tokens alongside the run (otherwise run duration would be limited to the access token lifetime);
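The idea behind refreshing tokens alongside a run can be sketched with the standard library alone. This is not the implementation of `auto_refresh.TokenAutoRefresh`: the class name `PeriodicTokenRefresh`, the `fetch_token` callable and the `interval_s` parameter are hypothetical, and the sketch assumes the MLFlow client picks up an updated `MLFLOW_TRACKING_TOKEN` environment variable on subsequent requests.

```python
import os
import threading
from typing import Callable


class PeriodicTokenRefresh:
    """Hypothetical sketch: republish a fresh access token while a run is active."""

    def __init__(self, fetch_token: Callable[[], str], interval_s: float) -> None:
        # fetch_token is an assumed callable that returns a valid access token,
        # e.g. by exchanging a refresh token with the identity provider.
        self._fetch_token = fetch_token
        self._interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def _publish(self) -> None:
        # MLFLOW_TRACKING_TOKEN is the variable MLflow uses for bearer authentication.
        os.environ["MLFLOW_TRACKING_TOKEN"] = self._fetch_token()

    def _loop(self) -> None:
        # Event.wait returns False on timeout (refresh again) and True once stopped.
        while not self._stop.wait(self._interval_s):
            self._publish()

    def __enter__(self) -> "PeriodicTokenRefresh":
        self._publish()  # make a token available before the run starts
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        self._stop.set()
        self._thread.join()
```

A run would be wrapped in `with PeriodicTokenRefresh(fetch_token, interval_s=...):`, with the interval chosen shorter than the access token lifetime. The sketch does no error handling: if `fetch_token` raises inside the background thread, refreshing silently stops.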
Warning
The sample code and architecture shouldn't be deployed in any testing/production environment. Additional measures and configuration are necessary. This template could be used as a starting point.
The following non-exhaustive list contains some considerations beyond the template setup which must be addressed if you plan on deploying a similar architecture:
- Use HTTPS and configure both Keycloak and NGINX with SSL certificates;
  - The sample simplifies this process by forcing HTTP (non-secure);
- Use PKCE on the OIDC client;
  - The sample has a simplified OIDC client/RP implementation which doesn't use PKCE, and only provides the Authorization Code Flow;
- Validate JWTs and use JWKS;
  - The sample OIDC client doesn't perform any validation on the returned tokens;
- To access the MLFlow UI in this architecture, each end-user is required to run a local proxy which sets the appropriate authorization headers;
  - This might be hard to set up and debug across platforms;
- MLFlow doesn't have a built-in notion of user;
  - MLFlow has a `User` tag that reads the current user from the system;
  - If the OIDC user should also be logged, `mlflow_app.utils.set_oidc_user` should be called at the beginning of each run (see the samples);
  - In any case, this only logs the user for the run and can be easily changed;
  - It is more robust to have either NGINX or Keycloak log user actions (e.g., endpoint calls, authorization requests);
- Maintenance is required to keep this setup updated and running;
  - The architecture uses multiple proxies (i.e., NGINX, local proxy) and custom libraries (e.g., OIDC client) to make everything work;
  - If MLFlow or any of the dependencies change, additional work is required and each end-user must update their library;
- A much simpler architecture would be to use an SSH tunnel to directly access the MLFlow server;
  - However, an identity-aware proxy would still be required to provide access control;
- Multiprocessing and multithreading safety;
  - The OIDC client isn't inherently thread-safe and depends on Python's GIL;
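As an illustration of the PKCE item above, the following minimal sketch derives a code verifier and its `S256` code challenge as described in RFC 7636, using only the standard library. The function names `make_code_verifier` and `make_code_challenge` are hypothetical and are not part of the sample OIDC client.

```python
import base64
import hashlib
import secrets


def make_code_verifier(n_bytes: int = 32) -> str:
    """Random URL-safe verifier; 32 bytes yields 43 characters (RFC 7636 allows 43 to 128)."""
    raw = secrets.token_bytes(n_bytes)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(code_verifier: str) -> str:
    """S256 challenge: BASE64URL(SHA256(ASCII(code_verifier))) without padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

A client would send `code_challenge` together with `code_challenge_method=S256` in the authorization request, and later send the original `code_verifier` in the token request so the identity provider can verify that both requests come from the same party.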
The diagram below gives an overview of the architecture.
---
title: Components
---
flowchart TD
subgraph Client Code
src_code(MLFlow Code) -->|"Configure Bearer Token"| oidc_client(OIDC Client)
end
subgraph NGINX
auth["Request Intercept"]
end
src_code <-->|"Tracking API"| NGINX
oidc_client -->|"Get Tokens"| keycloak[Keycloak]
auth -->|"Check User & Permissions"| keycloak
auth --> |"If User Allowed"| mlflow_server[MLFlow Tracking Server]
keycloak --> database[Relational Database]
mlflow_server --> object_storage[Object Storage]
mlflow_server --> database
- `Client Code`: refers to the code that interacts with the remote MLFlow server;
  - `MLFlow Code`: code that interacts with the `mlflow` API (e.g., Python, Scala, etc);
  - `OIDC Client`: small library that allows a user to log in to an Identity Provider (e.g., Keycloak);
- `Keycloak`: identity provider;
- `NGINX`: identity-aware reverse proxy;
  - `Request Intercept`: module that provides the "identity-aware" capability (i.e., checks with the Identity Provider whether the user is valid and has sufficient permission to the resource);
- `MLFlow Tracking Server`: remote tracking server;
- `Object Storage`: remote object storage for models, artifacts, etc;
- `Relational Database`: database for MLFlow (e.g., experiments, runs, metadata) and Keycloak (e.g., realms, users, tokens);
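To make the "Configure Bearer Token" and "Tracking API" arrows concrete, here is a minimal sketch of how client code could attach the access token to a request sent through NGINX. It rests on assumptions: the helper name `build_tracking_request` is hypothetical, and the `access_token` key is a guess at the layout of `credentials.json`, which may differ in the actual file. The request is only built, not sent.

```python
import json
import urllib.request
from pathlib import Path


def build_tracking_request(
    credentials_path: str,
    base_url: str = "http://localhost:8000",  # NGINX proxy address from the quick start
) -> urllib.request.Request:
    """Build (without sending) a Tracking API request carrying the bearer token."""
    tokens = json.loads(Path(credentials_path).read_text(encoding="utf-8"))
    access_token = tokens["access_token"]  # assumed key name, not confirmed by the repository
    return urllib.request.Request(
        url=f"{base_url}/api/2.0/mlflow/experiments/get-by-name?experiment_name=Default",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it to the proxy, where the request intercept decides whether it reaches the tracking server.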
- mlflow-oidc-auth: MLFlow auth plugin to use OpenID Connect (OIDC) as authentication and authorization provider;