Creating augmented suggester #54

Open
wants to merge 20 commits into
base: main

Conversation

grace-sng7
Collaborator

No description provided.

grace-sng7 and others added 16 commits May 18, 2025 20:30
Signed-off-by: Grace Sng <[email protected]>
Member

@amit-sharma amit-sharma left a comment

Thanks for the PR, @grace-sng7. I've shared a few inline comments.

I also notice that the build is failing. I'm working on a fix and will update this PR.

{
"cell_type": "code",
"source": [
"from pywhyllm.suggesters.augmented_model_suggester import AugmentedModelSuggester\n",
Member

The notebook makes sense.
Can you add some descriptive text to the notebook so that readers can understand:

  1. Motivation: we are introducing this retrieval-augmented suggester. It currently works on a specific cause-pairs dataset.
  2. How it works: we fetch the pairs from that dataset.
  3. Documentation for the actual function call: say that users can give any two variable names, and explain what happens if the pair is not found in the database.
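To make the three points concrete, here is a minimal sketch of the retrieve-then-ask flow. It is not the code in this PR: `load_pairs`, `find_evidence`, `build_prompt`, the record layout, and the fall-back-to-a-plain-question behavior are all assumptions made for illustration, and the tiny demo file stands in for the real dataset.

```python
import bz2
import json
import os
import tempfile


def load_pairs(file_path):
    """Read a bz2-compressed JSONL file into a {(cause, effect): sources} dict.

    The record layout used here is an assumption for illustration.
    """
    pairs = {}
    with bz2.open(file_path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if not line.strip():
                continue
            record = json.loads(line)
            relation = record["causal_relation"]
            key = (relation["cause"]["concept"].lower(),
                   relation["effect"]["concept"].lower())
            pairs.setdefault(key, []).extend(record.get("sources", []))
    return pairs


def find_evidence(pairs, variable_a, variable_b):
    """Return (cause, effect, sources) for either direction, or None if absent."""
    a, b = variable_a.strip().lower(), variable_b.strip().lower()
    for cause, effect in ((a, b), (b, a)):
        if (cause, effect) in pairs:
            return cause, effect, pairs[(cause, effect)]
    return None


def build_prompt(pairs, variable_a, variable_b):
    """Add retrieved evidence to the prompt; fall back to a plain question."""
    question = f"Does {variable_a} cause {variable_b}, or the reverse?"
    hit = find_evidence(pairs, variable_a, variable_b)
    if hit is None:
        # Assumed fallback: pair not in the dataset, so ask without extra context.
        return question
    cause, effect, sources = hit
    return f"Evidence: '{cause}' -> '{effect}' ({len(sources)} source(s)).\n{question}"


# Tiny stand-in dataset so the sketch runs without the real file.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.jsonl.bz2")
with bz2.open(demo_path, "wt", encoding="utf-8") as handle:
    handle.write(json.dumps({
        "causal_relation": {"cause": {"concept": "smoking"},
                            "effect": {"concept": "lung cancer"}},
        "sources": [{"type": "sentence"}],
    }) + "\n")

demo_pairs = load_pairs(demo_path)
print(build_prompt(demo_pairs, "Lung Cancer", "smoking"))
print(build_prompt(demo_pairs, "ice cream", "rain"))
```

The lookup checks both orderings because a user may not know which variable is the cause. Whatever the real suggester does on a miss, the notebook text should state it explicitly.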



class AugmentedModelSuggester(SimpleModelSuggester):
    def __init__(self, llm, file_path: str = 'data/causenet-precision.jsonl.bz2'):
Member

Each method needs a docstring so that future editors can understand the code easily.
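As a rough illustration of the level of detail meant here, the sketch below documents a hypothetical stand-in class and adds a small helper that reports methods still lacking a docstring. `ExampleSuggester`, `suggest_direction`, `methods_missing_docstrings`, and the default path are invented for this example and are not part of the PR.

```python
import inspect


class ExampleSuggester:
    """Hypothetical stand-in used only to show docstring layout.

    Args:
        llm: Any callable that maps a prompt string to a response string.
        file_path: Path to the evidence file consulted before calling the LLM.
    """

    def __init__(self, llm, file_path="data/example.jsonl.bz2"):
        """Store the LLM callable and the evidence file path."""
        self.llm = llm
        self.file_path = file_path

    def suggest_direction(self, variable_a, variable_b):
        """Ask the LLM which of two variables causes the other.

        Args:
            variable_a: Name of the first variable.
            variable_b: Name of the second variable.

        Returns:
            The raw response string from the LLM.
        """
        return self.llm(f"Which causes which: {variable_a} or {variable_b}?")


def methods_missing_docstrings(cls):
    """Return names of functions defined directly on cls that have no docstring."""
    return sorted(
        name
        for name, member in vars(cls).items()
        if inspect.isfunction(member) and not member.__doc__
    )


print(methods_missing_docstrings(ExampleSuggester))
```

A helper like this could be run over the new suggester class before merging to confirm nothing was missed.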

International Conference on Information & Knowledge Management (CIKM '20). Association for
Computing Machinery, New York, NY, USA, 3023–3030. https://doi.org/10.1145/3340531.3412763

TODO: Add license
Member

Yes, if the license is known, please do add it.

import re

from .simple_model_suggester import SimpleModelSuggester
from pywhyllm.utils.data_loader import *
Member

Instead of *, it would be better to name the three functions that are being imported from the data_loader module. This helps the reader understand which file a function comes from, since there are two files with import *.
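One way to find out which names a star import actually brings in is sketched below. `star_import_names` and `overlapping_names` are hypothetical helpers written for this comment, and the demo uses the standard-library modules `math` and `cmath` rather than the project's own modules.

```python
import importlib


def star_import_names(module_name):
    """List the names that `from module_name import *` would bind."""
    module = importlib.import_module(module_name)
    if hasattr(module, "__all__"):
        # A module that defines __all__ exports exactly those names.
        return sorted(module.__all__)
    # Otherwise every name without a leading underscore is exported.
    return sorted(name for name in vars(module) if not name.startswith("_"))


def overlapping_names(first_module, second_module):
    """Names exported by both modules; the later star import silently wins."""
    first = set(star_import_names(first_module))
    second = set(star_import_names(second_module))
    return sorted(first & second)


# Standard-library demo: both modules export a `sqrt`, among other names.
print(overlapping_names("math", "cmath"))
```

Running the first helper on the data loader module would list the candidates, and the names actually used can then be written out in the import statement.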

@grace-sng7
Collaborator Author

@amit-sharma

Hi Dr. Sharma, thanks for the comments. I've added a few updates to the PR.

Member

@amit-sharma amit-sharma left a comment

Thanks, @grace-sng7. I just have one comment on the notebook.

{
"cell_type": "markdown",
"source": [
"Here we introduce the AugmentedModelSuggester class. Creating an instance of it enables the chosen LLM to utilize Retrieval Augmented Generation (RAG) to determine causality. It currently does this by searching the CauseNet dataset for a relevant causal pair and augmenting the LLM with the corresponding evidence/information stored in CauseNet."
Member

It would be good to add a citation for CauseNet here.

Collaborator Author

> It would be good to add a citation for CauseNet here.

I have updated the notebook with a citation for CauseNet.
