Implement docker based command line code executor #1856
Merged
18 commits
8923072 implement docker based command line code executor (jackgerrits)
47866b8 undo import (jackgerrits)
38d0d29 test skips (jackgerrits)
83e0442 format (jackgerrits)
9caf3df fix type issue (jackgerrits)
0f76e8e skip docker tests (jackgerrits)
cfd5d81 fix paths (jackgerrits)
05e9eef add docs (jackgerrits)
6bcc759 Merge branch 'main' into docker_executor (jackgerrits)
d294c93 Update __init__.py (jackgerrits)
40ed5a6 class name (jackgerrits)
bdeaaec precommit (jackgerrits)
6454dbd undo twoagent change (jackgerrits)
9463579 use relative to directly (jackgerrits)
d8a0aa6 Merge branch 'main' into pr/jackgerrits/1856 (ekzhu)
0f5ce8e Update, fixes, etc. (ekzhu)
5a0d731 update doc (ekzhu)
92311e7 Update docstring (jackgerrits)
@@ -0,0 +1,231 @@
from __future__ import annotations
import atexit
from hashlib import md5
import logging
from pathlib import Path
from time import sleep
from types import TracebackType
import uuid
from typing import List, Optional, Type, Union
import docker
from docker.models.containers import Container
from docker.errors import ImageNotFound

from .local_commandline_code_executor import CommandLineCodeResult

from ..code_utils import TIMEOUT_MSG, _cmd
from .base import CodeBlock, CodeExecutor, CodeExtractor
from .markdown_code_extractor import MarkdownCodeExtractor
import sys

if sys.version_info >= (3, 11):
    from typing import Self
else:
    from typing_extensions import Self


def _wait_for_ready(container: Container, timeout: int = 60, stop_time: float = 0.1) -> None:
    # Poll the container status until it is running or the timeout elapses
    elapsed_time = 0.0
    while container.status != "running" and elapsed_time < timeout:
        sleep(stop_time)
        elapsed_time += stop_time
        container.reload()
        continue
    if container.status != "running":
        raise ValueError("Container failed to start")


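The polling loop in `_wait_for_ready` can be tried out without a Docker daemon. The following is a minimal sketch, assuming a hypothetical `FakeContainer` stand-in (not part of the docker SDK or of this PR) that reports `"running"` after a fixed number of `reload()` calls. The function name `wait_for_ready` is likewise illustrative.

```python
from time import sleep


class FakeContainer:
    """Hypothetical stand-in for a Docker container object.

    It reports "running" once reload() has been called a given number of times.
    """

    def __init__(self, reloads_until_running: int) -> None:
        self.status = "created"
        self._remaining = reloads_until_running

    def reload(self) -> None:
        # A real container object would refresh its cached state from the daemon here.
        self._remaining -= 1
        if self._remaining <= 0:
            self.status = "running"


def wait_for_ready(container, timeout: float = 60, stop_time: float = 0.1) -> None:
    """Poll container.status until it is "running" or the timeout elapses."""
    elapsed_time = 0.0
    while container.status != "running" and elapsed_time < timeout:
        sleep(stop_time)
        elapsed_time += stop_time
        container.reload()
    if container.status != "running":
        raise ValueError("Container failed to start")


if __name__ == "__main__":
    container = FakeContainer(reloads_until_running=3)
    wait_for_ready(container, timeout=1, stop_time=0.01)
    print(container.status)
```

Note that the elapsed time is accumulated from the sleep interval rather than read from a clock, so time spent inside `reload()` is not counted against the timeout.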
__all__ = ("DockerCommandLineCodeExecutor",)


class DockerCommandLineCodeExecutor(CodeExecutor):
    def __init__(
        self,
        image: str = "python:3-slim",
        container_name: Optional[str] = None,
        timeout: int = 60,
        work_dir: Union[Path, str] = Path("."),
        auto_remove: bool = True,
        stop_container: bool = True,
    ):
        """(Experimental) A code executor class that executes code through
        a command line environment in a Docker container.

        The executor first saves each code block in a file in the working
        directory, and then executes the code file in the container.
        The executor executes the code blocks in the order they are received.
        Currently, the executor only supports Python and shell scripts.
        For Python code, use the language "python" for the code block.
        For shell scripts, use the language "bash", "shell", or "sh" for the code
        block.

        Args:
            image (str, optional): Docker image to use for code execution.
                Defaults to "python:3-slim".
            container_name (Optional[str], optional): Name of the Docker container
                which is created. If None, will autogenerate a name. Defaults to None.
            timeout (int, optional): The timeout for code execution, in seconds.
                Defaults to 60.
            work_dir (Union[Path, str], optional): The working directory for the code
                execution. Defaults to Path(".").
            auto_remove (bool, optional): If true, will automatically remove the Docker
                container when it is stopped. Defaults to True.
            stop_container (bool, optional): If true, will automatically stop the
                container when stop is called, when the context manager exits or when
                the Python process exits with atexit. Defaults to True.

        Raises:
            ValueError: On argument error, or if the container fails to start.
        """

        if timeout < 1:
            raise ValueError("Timeout must be greater than or equal to 1.")

        if isinstance(work_dir, str):
            work_dir = Path(work_dir)

        if not work_dir.exists():
            raise ValueError(f"Working directory {work_dir} does not exist.")

        client = docker.from_env()

        # Check if the image exists
        try:
            client.images.get(image)
        except ImageNotFound:
            logging.info(f"Pulling image {image}...")
            # Let the docker exception escape if this fails.
            client.images.pull(image)

        if container_name is None:
            container_name = f"autogen-code-exec-{uuid.uuid4()}"

        # Start a container from the image, ready to exec commands later
        self._container = client.containers.create(
            image,
            name=container_name,
            entrypoint="/bin/sh",
            tty=True,
            auto_remove=auto_remove,
            volumes={str(work_dir.resolve()): {"bind": "/workspace", "mode": "rw"}},
            working_dir="/workspace",
        )
        self._container.start()

        _wait_for_ready(self._container)

        def cleanup():
            try:
                container = client.containers.get(container_name)
                container.stop()
            except docker.errors.NotFound:
                pass

            atexit.unregister(cleanup)

        if stop_container:
            atexit.register(cleanup)

        self._cleanup = cleanup

        # Check if the container is running
        if self._container.status != "running":
            raise ValueError(f"Failed to start container from image {image}. Logs: {self._container.logs()}")

        self._timeout = timeout
        self._work_dir: Path = work_dir

    @property
    def timeout(self) -> int:
        """(Experimental) The timeout for code execution."""
        return self._timeout

    @property
    def work_dir(self) -> Path:
        """(Experimental) The working directory for the code execution."""
        return self._work_dir

    @property
    def code_extractor(self) -> CodeExtractor:
        """(Experimental) Export a code extractor that can be used by an agent."""
        return MarkdownCodeExtractor()

    def execute_code_blocks(self, code_blocks: List[CodeBlock]) -> CommandLineCodeResult:
        """(Experimental) Execute the code blocks and return the result.

        Args:
            code_blocks (List[CodeBlock]): The code blocks to execute.

        Returns:
            CommandLineCodeResult: The result of the code execution."""

        if len(code_blocks) == 0:
            raise ValueError("No code blocks to execute.")

        outputs = []
        files = []
        last_exit_code = 0
        for code_block in code_blocks:
            lang = code_block.language
            code = code_block.code

            code_hash = md5(code.encode()).hexdigest()

            # Check if there is a filename comment

            # Get first line
            first_line = code.split("\n")[0]
            if first_line.startswith("# filename:"):
                # Split only once so a ":" later in the name is preserved
                filename = first_line.split(":", 1)[1].strip()

                # Handle relative paths in the filename
                path = Path(filename)
                if not path.is_absolute():
                    path = Path("/workspace") / path
                path = path.resolve()
                try:
                    path.relative_to(Path("/workspace"))
                except ValueError:
                    return CommandLineCodeResult(exit_code=1, output="Filename is not in the workspace")
            else:
                # create a file with an automatically generated name
                filename = f"tmp_code_{code_hash}.{'py' if lang.startswith('python') else lang}"

            code_path = self._work_dir / filename
            with code_path.open("w", encoding="utf-8") as fout:
                fout.write(code)

            command = ["timeout", str(self._timeout), _cmd(lang), filename]

            result = self._container.exec_run(command)
            exit_code = result.exit_code
            output = result.output.decode("utf-8")
            # GNU timeout exits with status 124 when the command times out
            if exit_code == 124:
                output += "\n"
                output += TIMEOUT_MSG

            outputs.append(output)
            files.append(code_path)

            last_exit_code = exit_code
            if exit_code != 0:
                break

        code_file = str(files[0]) if files else None
        return CommandLineCodeResult(exit_code=last_exit_code, output="".join(outputs), code_file=code_file)

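The filename handling in `execute_code_blocks` (read an optional `# filename:` comment on the first line, then reject names that escape `/workspace`) can be isolated into two small helpers. This is a sketch only: the helper names `filename_from_code` and `is_inside_workspace` are hypothetical, and it swaps the PR's `Path.resolve()` for a purely lexical `posixpath.normpath`, so it never consults the host filesystem and does not follow symlinks.

```python
import posixpath
from pathlib import PurePosixPath
from typing import Optional

# Mount point used inside the container, as in the diff above.
WORKSPACE = PurePosixPath("/workspace")


def filename_from_code(code: str) -> Optional[str]:
    """Return the name from a leading '# filename: <name>' comment, or None."""
    first_line = code.split("\n")[0]
    if first_line.startswith("# filename:"):
        # maxsplit=1 keeps any later ':' in the name intact.
        return first_line.split(":", 1)[1].strip()
    return None


def is_inside_workspace(filename: str) -> bool:
    """Return True if filename, taken relative to /workspace, stays under it."""
    path = PurePosixPath(filename)
    if not path.is_absolute():
        path = WORKSPACE / path
    # normpath collapses '..' segments lexically (assumption: no symlinks involved).
    normalized = PurePosixPath(posixpath.normpath(str(path)))
    try:
        normalized.relative_to(WORKSPACE)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for name in ("sub/a.py", "../etc/passwd", "/workspace/x.py", "/tmp/x.py"):
        print(name, is_inside_workspace(name))
```

Working with `PurePosixPath` keeps the check tied to the container's path semantics regardless of the host operating system.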
    def restart(self) -> None:
        """(Experimental) Restart the code executor."""
        self._container.restart()
        # Refresh the cached container state before checking the status
        self._container.reload()
        if self._container.status != "running":
            raise ValueError(f"Failed to restart container. Logs: {self._container.logs()}")

    def stop(self) -> None:
        """(Experimental) Stop the code executor."""
        self._cleanup()

    def __enter__(self) -> Self:
        return self

    def __exit__(
        self, exc_type: Optional[Type[BaseException]], exc_val: Optional[BaseException], exc_tb: Optional[TracebackType]
    ) -> None:
        self.stop()
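The executor stops its container through one `cleanup` closure that is shared by `stop()`, the context manager exit, and an `atexit` hook, and that unregisters itself after running. A minimal sketch of that pattern follows, assuming a hypothetical `ManagedResource` class in which a counter stands in for `container.stop()`; nothing here talks to Docker.

```python
import atexit


class ManagedResource:
    """Hypothetical stand-in for the executor; stop_calls mimics container.stop()."""

    def __init__(self, stop_on_exit: bool = True) -> None:
        self.stop_calls = 0

        def cleanup() -> None:
            self.stop_calls += 1
            # Unregister so interpreter shutdown does not run cleanup again.
            # atexit.unregister silently does nothing if cleanup was never registered.
            atexit.unregister(cleanup)

        if stop_on_exit:
            atexit.register(cleanup)

        self._cleanup = cleanup

    def stop(self) -> None:
        self._cleanup()

    def __enter__(self) -> "ManagedResource":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        self.stop()


if __name__ == "__main__":
    with ManagedResource() as resource:
        print("inside:", resource.stop_calls)
    print("after:", resource.stop_calls)
```

With the real class the same shape would apply, for example `with DockerCommandLineCodeExecutor(work_dir=...) as executor:`, so the container is stopped when the block exits even if an exception is raised inside it.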