-
Notifications
You must be signed in to change notification settings - Fork 280
Adding pan and scan to gemma 3 #2216
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. I've left some general comments on the code design to align it with the Keras standard.
@@ -0,0 +1,255 @@ | |||
from enum import Enum | |||
from typing import Optional, Union, Tuple, Dict, List | |||
import tensorflow as tf |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove unused imports
LAST = "channels_last" | ||
|
||
def infer_channel_dimension_format( | ||
image: np.ndarray, num_channels: Optional[Union[int, Tuple[int, ...]]] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove type annotation, we don't do type annotation in Keras.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be this can be used to get the image data format?
def standardize_data_format(data_format): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something like,
data_format = standardize_data_format(data_format)
h_axis, w_axis, channels_axis = (
(-3, -2, -1) if data_format == "channels_last" else (-2, -1, -3)
)
Args: | ||
image (`np.ndarray`): | ||
The image to infer the channel dimension of. | ||
num_channels (`int` or `Tuple[int, ...]`, *optional*, defaults to `(1, 3)`): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
usual convention for args is
num_channels: int or tuple of 2 integers, optional. The number of channels of the image. Defaults to (1, 3)
.
You can apply the same format for all the Args details in this PR.
I think we should probably close this PR. This looks like a general image processing utility, I don't see much Gemma specific about it. We could add functionality like this to core Keras if we think it is generally useful and the inputs and output are clear. Or we could just leave this to any downstream code that needs pan and scan if it's going to be hard to make something general. @SindhuRaghuram97 let me know if that makes sense! |
Hello, yes this wasn't really supposed to be specific to Gemma but can be used in tandem with Gemma 3 if the user wishes to do so. And I believe Pan and Scan was a requirement to have within the codebase. Apologies for the delay, I haven't been able to address the comments due to a forthcoming release but I'll have it done within the next few days |
Maybe worth starting with the usage. How would an end user use this option in a minimal example? What's exposed? |
This PR is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you. |
Adding a gemma3_utils file to include modules supporting the implementation of pan and scan for Gemma 3