Adding pan and scan to gemma 3 #2216

Status: Open. 1 commit to merge into base: master

Conversation

SindhuRaghuram97
Collaborator

This PR adds a gemma3_utils file containing modules that support the implementation of pan and scan for Gemma 3.

@github-actions bot added the Gemma label (Gemma model specific issues) on Apr 15, 2025
Collaborator

@sachinprasadhs left a comment

Thanks for the PR. I've left some general comments on the code design to align it with the Keras standard.

@@ -0,0 +1,255 @@
from enum import Enum
from typing import Optional, Union, Tuple, Dict, List
import tensorflow as tf
Collaborator

Remove unused imports

LAST = "channels_last"

def infer_channel_dimension_format(
image: np.ndarray, num_channels: Optional[Union[int, Tuple[int, ...]]] = None
Collaborator

Remove the type annotations; we don't use type annotations in Keras.

Collaborator

Maybe this can be used to get the image data format?

def standardize_data_format(data_format):

Collaborator

Something like:

  data_format = standardize_data_format(data_format)
  h_axis, w_axis, channels_axis = (
      (-3, -2, -1) if data_format == "channels_last" else (-2, -1, -3)
  )
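To make the suggestion above concrete, here is a minimal, self-contained sketch of how the three axis indices could be used to read an image's height, width, and channel count for either data format. The `standardize_data_format` defined here is only a hypothetical stand-in for the Keras helper referenced in the review (it is assumed to normalize and validate the string, and this sketch defaults to `"channels_last"` when nothing is passed), and `split_image_shape` is an illustrative name that does not appear in the PR.

```python
def standardize_data_format(data_format):
    # Hypothetical stand-in for the Keras helper mentioned in the review.
    # Assumption: it normalizes and validates the data format string.
    if data_format is None:
        data_format = "channels_last"  # assumed default for this sketch
    data_format = str(data_format).lower()
    if data_format not in ("channels_last", "channels_first"):
        raise ValueError(
            "`data_format` must be 'channels_last' or 'channels_first'. "
            f"Received: data_format={data_format}"
        )
    return data_format


def split_image_shape(shape, data_format=None):
    # Illustrative helper (not part of the PR): returns
    # (height, width, channels) from a batched or unbatched image shape.
    data_format = standardize_data_format(data_format)
    h_axis, w_axis, channels_axis = (
        (-3, -2, -1) if data_format == "channels_last" else (-2, -1, -3)
    )
    return shape[h_axis], shape[w_axis], shape[channels_axis]


# Unbatched channels-last image: (height, width, channels).
print(split_image_shape((4, 6, 3)))
# Batched channels-first images: (batch, channels, height, width).
print(split_image_shape((2, 3, 4, 6), "channels_first"))
```

Because the indices are negative, the same lookup works with or without a leading batch dimension, which may be why the review suggests this form over inferring the layout from the array itself.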

Args:
image (`np.ndarray`):
The image to infer the channel dimension of.
num_channels (`int` or `Tuple[int, ...]`, *optional*, defaults to `(1, 3)`):
Collaborator

The usual convention for args is:
num_channels: int or tuple of 2 integers, optional. The number of channels of the image. Defaults to (1, 3).

You can apply the same format for all the Args details in this PR.
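As an illustration of the two review points (no type annotations, and the `name: type, optional. Description. Defaults to ...` docstring layout), here is a hedged sketch of what a helper written in that style might look like. The function name `infer_channel_axis`, its shape-tuple input, and its behavior are assumptions made for this example; the actual body of `infer_channel_dimension_format` is not shown in this conversation.

```python
def infer_channel_axis(image_shape, num_channels=(1, 3)):
    """Infer whether an image shape is channels-first or channels-last.

    Illustrative sketch only; not the implementation from the PR.

    Args:
        image_shape: tuple of 3 integers. The shape of a single,
            unbatched image.
        num_channels: int or tuple of integers, optional. The channel
            counts to look for. Defaults to `(1, 3)`.

    Returns:
        String, either `"channels_first"` or `"channels_last"`.
    """
    if isinstance(num_channels, int):
        num_channels = (num_channels,)
    first_matches = image_shape[0] in num_channels
    last_matches = image_shape[-1] in num_channels
    # Refuse to guess when both axes match or neither does.
    if first_matches == last_matches:
        raise ValueError(
            "Unable to infer the channel axis. "
            f"Received: image_shape={image_shape}"
        )
    return "channels_first" if first_matches else "channels_last"


print(infer_channel_axis((4, 6, 3)))
print(infer_channel_axis((1, 8, 8), num_channels=1))
```

Raising on ambiguous shapes is a design choice of this sketch; a real utility might instead fall back to the configured data format.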

@mattdangerw
Member

I think we should probably close this PR.

This looks like a general image processing utility, I don't see much Gemma specific about it.

We could add functionality like this to core Keras if we think it is generally useful and the inputs and output are clear. Or we could just leave this to any downstream code that needs pan and scan if it's going to be hard to make something general.

@SindhuRaghuram97 let me know if that makes sense!

@SindhuRaghuram97
Collaborator Author

> I think we should probably close this PR.
>
> This looks like a general image processing utility, I don't see much Gemma specific about it.
>
> We could add functionality like this to core Keras if we think it is generally useful and the inputs and output are clear. Or we could just leave this to any downstream code that needs pan and scan if it's going to be hard to make something general.
>
> @SindhuRaghuram97 let me know if that makes sense!

Hello, yes, this wasn't really meant to be specific to Gemma, but it can be used in tandem with Gemma 3 if the user wishes. I believe pan and scan was a requirement to have within the codebase. Apologies for the delay; I haven't been able to address the comments because of a forthcoming release, but I'll have it done within the next few days.

@mattdangerw
Member

Maybe worth starting with the usage. How would an end user use this option in a minimal example? What's exposed?

This PR is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

@github-actions bot added the stale label on May 13, 2025
3 participants