
bug: Cannot sanitize user input #882

@vrige

Description


Did you check docs and existing issues?

  • I have read all the NeMo-Guardrails docs
  • I have updated the package to the latest version before submitting this issue
  • (optional) I have used the develop branch
  • I have searched the existing issues of NeMo-Guardrails

Python version (python --version)

Python 3.10.15

Operating system/version

macOS 14

NeMo-Guardrails version (if you must use a specific version and not the latest)

No response

Describe the bug

Hi, thanks for the amazing work you are doing.

I am having a problem with input sanitization using a custom action (sanitizeInputAction). According to the logs, the action works just fine, but the LLM receives the original input instead of the sanitized one.
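For reference, the action is essentially a PII masker. A minimal stand-in that shows the shape of it (the regex is illustrative only; my real action uses a proper detector):

import re

from nemoguardrails.actions import action

@action(name="sanitizeInputAction")
async def sanitize_input_action(inputs: str) -> str:
    # Stand-in PII masking: replace person-like names with <PERSON>.
    # The real action uses proper PII detection; this naive regex just
    # keeps the repro self-contained.
    return re.sub(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b", "<PERSON>", inputs)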

I have followed the steps from the documentation:

  • Input Rails Only

  • I also found the following test for Colang 1.0 here, but it doesn't seem applicable to Colang 2.x, and I haven't found a corresponding case for 2.x yet (the sketch after this list shows the 1.0 pattern that test relies on).
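For comparison, the Colang 1.0 pattern works because an input rail there can overwrite the $user_message context variable, roughly like this (written in the same style as my repro below; the flow name is illustrative):

colang_v1 = """
define flow sanitize input
    # In Colang 1.0, reassigning $user_message inside an input rail
    # changes the text that reaches the LLM afterwards.
    $user_message = execute sanitizeInputAction(inputs=$user_message)
"""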

An example with fake data:

user: "Hi! Can you repeat the following name: Tom Jerry. Thanks"
llm: "Hi! Can you repeat the following name: <PERSON>. Thanks
Sure, the name is Tom Jerry. Did you want me to repeat it again?"

I am open to any suggestions.

Is there something I missed? Do I need to add more code?

Thanks for your help!

Steps To Reproduce

colang_v2 = """
    import core
    import llm

    flow main
        $ref_act = await analyse_input
        $output = ... "'{$ref_act}'"

    flow analyse_input -> $result
        user said something as $ref_use
        $result = await sanitizeInputAction(inputs=$ref_use.transcript)
        return $result
    """

config_v2_o = """
    colang_version: 2.x
    rails:
      input:
        flows:
          - main
    models:
      - type: main
        engine: openai
        model: gpt-3.5-turbo-instruct
    """

Expected Behavior

I expect:

  • the flow to be triggered for any user input text (working)
  • the input text to be sanitized (working)
  • the LLM to generate a reply using only the sanitized input text (not working)
  • the input text not to be repeated (not working)

Actual Behavior

What actually happens:

  • the flow is triggered for any user input text
  • the input text is sanitized (confirmed in the logs)
  • the LLM still receives the personal data as input
  • the reply repeats the sanitized input text

Labels: bug (Something isn't working), status: waiting confirmation (Issue is waiting confirmation whether the proposed solution/workaround works)
