Skip to content

Conversation

rooa
Copy link
Contributor

@rooa rooa commented May 26, 2022

What does this PR do?

Adds CodeGen PyTorch model.

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@lvwerra @patil-suraj @loubnabnl

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented May 26, 2022

The documentation is not available anymore as the PR was closed or merged.

Copy link
Contributor

@patil-suraj patil-suraj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for adding this model @rooa ! The code is looking good already. Left some comments. Specifically:

  • We should add as much Copied from ... statements as possible
  • We should remove the manual parallelization logic, more details in the comment below.

Let me know if you have any other questions. I will look into tests after these changes :)

Copy link
Contributor

@patil-suraj patil-suraj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for addressing the comments, the PR looks almost ready. Just left a comment about tie_word_embeddings.

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding this new model!
Good for me with @patil-suraj comments.

Copy link
Contributor

@patrickvonplaten patrickvonplaten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me in general. Would be nice if we could give self.bias a better name - think it'd make reading the code much easier

]:

qkv = self.qkv_proj(hidden_states)
# TODO(enijkamp): factor out number of logical TPU-v4 cores or make forward pass agnostic
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(out of curiosity) what does the comment mean here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@patil-suraj why resolve here?

Copy link
Contributor

@patil-suraj patil-suraj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @rooa ! This looks good for merge now, once @patrickvonplaten's comment is addressed.

Copy link
Contributor

@patrickvonplaten patrickvonplaten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some things to clean up before merging:

  • Some tests are failing
  • We should add a truncate_before_pattern function arg that takes a list of patterns before which we truncate. I think it's important to stay flexible here

return pairs


class CodeGenTokenizer(PreTrainedTokenizer):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Think we should add some more # Copied from ... statements here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@patil-suraj why resolve here if there hasn't been an answer or change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad. Here we add an extra method truncate in the tokenizer , so didn't add the # Copied from ... statement.

@sam-h-bean
Copy link
Contributor

Hey @patil-suraj @rooa you should go fetch upstream on your fork. There were some test fixes that I think you are missing which is causing the red exes that no one likes to see. I actually would love to use this but I can't because this PR is not merged yet!

Copy link
Contributor

@patrickvonplaten patrickvonplaten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to merge for me!

@patil-suraj
Copy link
Contributor

Merging now! Thanks a lot @rooa for working on this and being patient with the review and tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants