[Serve][1/n] Add deployment-scoped actors (deployment_actors) to Ray Serve#61566
[Serve][1/n] Add deployment-scoped actors (deployment_actors) to Ray Serve#61566abrarsheikh wants to merge 3 commits intomasterfrom
deployment_actors) to Ray Serve#61566Conversation
Signed-off-by: abrar <abrar@anyscale.com>
There was a problem hiding this comment.
Code Review
This pull request introduces deployment_actors, a new feature for declaring long-lived, deployment-scoped Ray actors. The changes are well-integrated across the Serve stack, from configuration and protobuf schemas to the public API, and include comprehensive tests. The implementation is solid, but I have one suggestion to improve the performance of the duplicate name validation logic.
Signed-off-by: abrar <abrar@anyscale.com>
jeffreywang-anyscale
left a comment
There was a problem hiding this comment.
went through a first pass. need a deeper look into deployment_state.py and tests.
| // Unique name for this actor within the deployment. | ||
| string name = 5; |
There was a problem hiding this comment.
nit: order fields by value
| def _loads(b): | ||
| return cloudpickle.loads(b) if b else None |
There was a problem hiding this comment.
nit: move this out of the for loop
| self.tree = serve.get_deployment_actor("prefix_tree") | ||
|
|
||
| def __call__(self, request): | ||
| ray.get(self.tree.insert.remote(request.text)) |
There was a problem hiding this comment.
nit: PrefixTreeActor in this docstring example doesn't implement insert().
| def _serialize_actor_class(self) -> None: | ||
| """Import and serialize the actor class with pickle-by-value.""" | ||
| actor_class = self.actor_class | ||
| if isinstance(actor_class, str): |
There was a problem hiding this comment.
this path is not possible because the caller on line 990 only enters when self.actor_class is not a string.
| """ | ||
| internal_context = _get_internal_replica_context() | ||
| if internal_context is None: | ||
| from ray.serve.exceptions import RayServeException |
There was a problem hiding this comment.
this is already imported at the beginning of the file
| self.tree = serve.get_deployment_actor("prefix_tree") | ||
|
|
||
| def __call__(self, request): | ||
| ray.get(self.tree.insert.remote(request.text)) |
There was a problem hiding this comment.
same nit here: insert() is not implemented by this docstring example
| class DeploymentActors(DeploymentActorContainer): | ||
| """Backward-compatible alias for DeploymentActorContainer.""" |
There was a problem hiding this comment.
is there a reason why we need backward compatibility for DeploymentActorContainer? isn't it first introduced in this PR?
| self._versions_ready.discard(code_ver) | ||
| self._wrappers_by_version.pop(code_ver, None) | ||
|
|
||
| if versions_to_keep is None: |
There was a problem hiding this comment.
is it possible for us to leak pending actors under the following situation?
- first version is deployed
- second version arrives while the first version is in PENDING state
- now, the target version is the second version and added to
versions_to_keep
throughout the deployment lifetime, the actors associated with the first version are leaked.
DeploymentActorConfig, a new config for declaring long-lived, deployment-scoped Ray actors that are shared across all replicas of a deployment and managed by the Serve controller.serve.get_deployment_actor(name)API for replicas to obtain a handle to a deployment-scoped actor by name.deployment_actorsthrough the full config stack: protobuf schema,DeploymentConfig(with proto serialization roundtrip),DeploymentSchema,Deployment.options(), and the declarativeServeDeploySchemapath.deployment_actorsinget_app_code_versionso that changes to deployment actor configs trigger redeployment in the declarative API.Test plan
test_config.py—DeploymentActorConfigcreation, proto roundtrip, duplicate-name validation, import-path resolution.test_schema.py— schema validation (dict, unset, None),deployment_to_schema/schema_to_deploymentroundtrip, dict-based schema,get_app_code_versionhash sensitivity.test_common.py—get_deployment_actor_namedeterministic naming.test_api.py—deployment_actorsas dict andDeploymentActorConfigobject via Python API (serve.run);get_deployment_actorerror outside replica.test_deploy_app.py— declarative API (client.deploy_appswithServeDeploySchema) withdeployment_actors, verifying app reaches RUNNING and serves traffic.