Nexus: worker, workflow-backed operations, and workflow caller #813

Open: wants to merge 94 commits into base: main

dandavison (Contributor) commented Apr 2, 2025

Initial Python Temporal Nexus implementation.

Temporal SDK PR to accompany

Contains Nexus worker, components for users to define workflow-backed Nexus operations, and the ability to start and cancel a Nexus operation from a workflow.

Notes for reviewers

  • nexusrpc.handler.start_workflow is a top-level static function, but currently there's still a public contextvar object, nexusrpc.handler.temporal_operation_context. Let's settle on an approach there. If we're using module-level getters, I think they need to be named something like get_client(), even though we already have activity.metric_meter(). A bit more discussion is needed there.
  • The workflow caller is in there, but needs more cleaning up. Feel free to ignore those files for now.
  • I'll do another cleanup pass on docstrings, API docs, etc.

@dandavison dandavison force-pushed the nexus branch 2 times, most recently from a845a6f to 7a121bb Compare April 5, 2025 15:08
@dandavison dandavison force-pushed the nexus branch 10 times, most recently from 862e4f9 to 8ac0192 Compare April 17, 2025 12:17
@dandavison dandavison force-pushed the nexus branch 4 times, most recently from 8940d51 to 520aecf Compare April 26, 2025 16:16
@dandavison dandavison force-pushed the nexus branch 2 times, most recently from 2fc971f to 8bd6011 Compare April 29, 2025 13:11
@dandavison dandavison changed the title from "Nexus prototype" to "Nexus" May 8, 2025
@dandavison dandavison force-pushed the nexus branch 2 times, most recently from 5d54f48 to 8ddfb94 Compare May 24, 2025 01:49
@dandavison dandavison force-pushed the nexus branch 7 times, most recently from fa3e8ec to f7bf47b Compare May 27, 2025 20:34
@@ -94,6 +94,7 @@ informal introduction to the features and their implementation.
- [Heartbeating and Cancellation](#heartbeating-and-cancellation)
- [Worker Shutdown](#worker-shutdown)
- [Testing](#testing-1)
- [Nexus](#nexus)
Member

I do not see this section

Comment on lines +216 to +217
[tool.uv.sources]
nexus-rpc = { path = "../nexus-sdk-python", editable = true }
Member

Let's make sure not to merge until this is a proper dependency.

priority: temporalio.common.Priority = temporalio.common.Priority.default,
# The following options are deliberately not exposed in overloads
Member

Unresolved (but pedantic, don't have to take if not wanted, I am just afraid of users using them)

Comment on lines +5204 to +5206
nexus_completion_callbacks: Sequence[NexusCompletionCallback]
workflow_event_links: Sequence[temporalio.api.common.v1.Link.WorkflowEvent]
request_id: Optional[str]
Member

May want to mention here these are unstable/experimental

@@ -7231,6 +7260,17 @@ def api_key(self, value: Optional[str]) -> None:
self.service_client.update_api_key(value)


@dataclass(frozen=True)
class NexusCompletionCallback:
Member

May want to mention this is unstable/experimental and also not really for user use (I understand exposing because it's exposed in the interceptor)

Comment on lines 125 to 131
if not all(
    c in "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_-"
    for c in s
):
    raise ValueError(
        "invalid base64URL encoded string: contains invalid characters"
    )
Member

Pedantic and although not really documented well, I think urlsafe_b64decode will fail on invalid chars (though can check there are no trailing = if needed)
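
This is easy to check directly. In CPython, `base64.urlsafe_b64decode` delegates to `b64decode` with the default `validate=False`, which discards characters outside the alphabet instead of raising, so an explicit check is one way to get strict behavior. The sketch below is illustrative only: `strict_urlsafe_b64decode` is a hypothetical helper, not the function in the PR, and it assumes unpadded base64url input as the quoted check does.

```python
import base64
import re

# Same alphabet as the explicit check quoted above; "=" is deliberately excluded.
_URLSAFE_ALPHABET = re.compile(r"[A-Za-z0-9_-]*")


def strict_urlsafe_b64decode(s: str) -> bytes:
    """Decode unpadded base64url, rejecting characters outside the alphabet."""
    if not _URLSAFE_ALPHABET.fullmatch(s):
        raise ValueError(
            "invalid base64URL encoded string: contains invalid characters"
        )
    # Restore the padding that unpadded base64url omits.
    padded = s + "=" * (-len(s) % 4)
    return base64.urlsafe_b64decode(padded)


# The lenient default drops the stray "!" instead of raising.
lenient = base64.urlsafe_b64decode("aGVs!bG8=")
```

A regular expression keeps the check short; the character-by-character `all(...)` form in the diff is equivalent.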

@@ -3,22 +3,27 @@
from __future__ import annotations
Member

This is in conflict with the statement on the PR description of "Feel free to ignore those files for now". Ignoring workflow caller stuff for now.

# TODO(nexus-prerelease): headers
try:
    await self._handler.cancel_operation(ctx, request.operation_token)
except Exception as err:
Member

So this is the case where I think we should consider catching BaseException. I think we need to swallow all user exceptions that can occur for any reason and respond back
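
To illustrate the difference, here is a minimal sketch, not the PR's worker code. `UserAbort` and the handler functions are hypothetical names. It shows that an exception deriving directly from `BaseException` escapes an `except Exception` clause but is caught by `except BaseException`.

```python
import asyncio


class UserAbort(BaseException):
    """Hypothetical error that user code might raise; not from the PR."""


async def failing_cancel_handler() -> None:
    # Stands in for a user-defined cancel handler that fails.
    raise UserAbort("raised from user code")


async def run_narrow() -> str:
    try:
        await failing_cancel_handler()
    except Exception as err:
        # Never reached for UserAbort: it does not derive from Exception.
        return f"handled: {type(err).__name__}"
    return "ok"


async def run_broad() -> str:
    try:
        await failing_cancel_handler()
    except BaseException as err:
        # Reached for every exception type, so a response can always be sent.
        return f"handled: {type(err).__name__}"
    return "ok"
```

Note that `except BaseException` also catches `asyncio.CancelledError`, so a worker taking this approach would probably need to decide explicitly how to treat task cancellation.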

try:
    await self._handler.cancel_operation(ctx, request.operation_token)
except Exception as err:
    logger.exception("Failed to execute Nexus cancel operation method")
Member

We had traditionally treated these types of things as "warning" level not "error" level because it is not an SDK error, it is a user-code error, the SDK is just fine.
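
As a rough sketch of what that could look like (the logger name and the `log_user_failure` function are hypothetical): `logger.exception(...)` always logs at ERROR, while `logger.warning(..., exc_info=True)` keeps the traceback but records the event at WARNING.

```python
import logging

logger = logging.getLogger("nexus_worker_sketch")  # hypothetical logger name


def log_user_failure() -> None:
    try:
        # Stands in for a user handler that raises.
        raise RuntimeError("user handler failed")
    except Exception:
        # logger.exception(...) would log at ERROR level here.
        # warning + exc_info=True keeps the traceback at WARNING level.
        logger.warning(
            "Failed to execute Nexus cancel operation method", exc_info=True
        )
```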

try:
    await self._handler.cancel_operation(ctx, request.operation_token)
except Exception as err:
    logger.exception("Failed to execute Nexus cancel operation method")
Member

If possible, use the contextual logger that has the Nexus info on it for this and start exception
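
One common way to build such a contextual logger is the standard library's `logging.LoggerAdapter`. The sketch below is an assumption about the general shape, not the logger this PR defines: `OperationLoggerAdapter`, the logger name, and the `service` and `operation` fields are hypothetical.

```python
from __future__ import annotations

import logging
from typing import Any, MutableMapping


class OperationLoggerAdapter(logging.LoggerAdapter):
    """Prefixes messages with operation details (hypothetical field names)."""

    def process(
        self, msg: Any, kwargs: MutableMapping[str, Any]
    ) -> tuple[Any, MutableMapping[str, Any]]:
        info = self.extra or {}
        prefix = f"[service={info.get('service')} operation={info.get('operation')}]"
        return f"{prefix} {msg}", kwargs


base_logger = logging.getLogger("nexus_context_sketch")  # hypothetical name
op_logger = OperationLoggerAdapter(
    base_logger, {"service": "MyService", "operation": "my-op"}
)
```

Calling `op_logger.warning("Failed to execute Nexus cancel operation method", exc_info=True)` would then attach the operation details to the message without each call site repeating them.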


def __init__(
    self,
    start: Optional[
Member

Just for posterity, noting that I disagree with this approach vs a simple approach of instantiating this class with the start workflow args. I think making users do nested functions for something as simple as tying a workflow to an operation is rough.

Contributor Author

OK let's not treat that discussion as closed.

2 participants