
Change/modify the input before sending it to OpenAI as a prompt

CodinjaoftheWorld opened this issue 1 year ago • 3 comments

Self Checks

  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] Please do not modify this template :) and fill in all the required fields.

Provide a description of the help you need

I have referred to the documentation but was not able to find the required functions.
I need help with documentation/references on modifying/preprocessing the input before sending it to the LLM (OpenAI) as a prompt.

CodinjaoftheWorld avatar Mar 08 '24 13:03 CodinjaoftheWorld

Hey there, @CodinjaoftheWorld! Good to see you back. What's cooking in the code kitchen today?

To modify or preprocess the input before sending it to the Language Model (OpenAI) as a prompt, you can follow these steps based on the provided context in api/core/tools/prompt/template.py:

  1. Identify the Template Type: Determine whether you are working with a chat or completion prompt based on your application's needs. Each has a slightly different template but follows a similar structure.

  2. Customize the Instruction: The {{instruction}} placeholder can be replaced with specific instructions tailored to the input you're processing. This could include specific guidelines on how the model should interpret the input or any constraints you want to apply.

  3. Tool Utilization: The {{tools}} and {{tool_names}} placeholders suggest that you can specify tools that the model can use to process the input. If you have specific preprocessing tools (like text normalization, entity recognition, etc.), you can list them here. You would replace {{tools}} with a list of available tools and {{tool_names}} with the valid "action" values corresponding to those tools.

  4. Format the Input: The input question should be formatted as specified under the Question: section. If preprocessing is needed, consider adding a step in the Thought: section where you detail the preprocessing logic or tool usage before sending the actual input to the model.

  5. JSON Blob for Actions: When specifying an action (i.e., a preprocessing step), ensure it is formatted correctly as a JSON blob; a minimal Python sketch of building such a blob programmatically appears after this list. For example, if you have a preprocessing tool named "NormalizeText", you might specify it as follows:

{
  "action": "NormalizeText",
  "action_input": "Text to be normalized"
}
  6. Iterative Processing: The template allows for iterative processing (Thought/Action/Observation) before arriving at the final answer. Use this to your advantage for complex preprocessing that requires multiple steps.

  7. Final Formatting: Ensure the final action that sends the processed input to the model for generating the response is correctly formatted, with "action": "Final Answer" and the processed input as "action_input".
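As referenced in step 5, here is a minimal Python sketch of building such an action blob programmatically. The build_action helper and the "NormalizeText" tool name are illustrative assumptions, not part of Dify's codebase; only the standard json module is used:

import json

def build_action(tool_name, tool_input):
    # Build the single-action JSON blob expected by the ReAct-style template.
    return json.dumps({"action": tool_name, "action_input": tool_input}, ensure_ascii=False)

# Example usage
print(build_action("NormalizeText", "Text to be normalized"))
# -> {"action": "NormalizeText", "action_input": "Text to be normalized"}
print(build_action("Final Answer", "Normalized response to human"))
# -> {"action": "Final Answer", "action_input": "Normalized response to human"}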

Here's an example of how you might modify the template for a preprocessing step that involves text normalization:

ENGLISH_REACT_COMPLETION_PROMPT_TEMPLATES = """Respond to the human as helpfully and accurately as possible. 

Instruction: Normalize the input text before processing.

You have access to the following tools:

- NormalizeText

Use a json blob to specify a tool by providing an action key (tool name) and an action_input key (tool input).
Valid "action" values: "Final Answer" or "NormalizeText"

Provide only ONE action per $JSON_BLOB, as shown:

{ "action": "NormalizeText", "action_input": "Text to be normalized" }


Follow this format:

Question: {{query}}
Thought: Consider normalizing the text to improve accuracy.
Action:

{ "action": "NormalizeText", "action_input": "{{query}}" }

Observation: action result
Thought: I know what to respond
Action:

{ "action": "Final Answer", "action_input": "Normalized response to human" }


Begin! Reminder to ALWAYS respond with a valid json blob of a single action. Use tools if necessary. Respond directly if appropriate. Format is Action:```$JSON_BLOB```then Observation:."""

This example demonstrates how to integrate a preprocessing step (text normalization) into the prompt template. Adjust the tool names and actions according to your specific preprocessing needs.
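If you prefer to fill the template's placeholders in application code rather than by editing the template file, a plain string substitution is usually enough. The sketch below is only an illustration: render_prompt and preprocess_query are hypothetical helpers, and Dify's own renderer may handle the {{query}} placeholder differently. It reuses the ENGLISH_REACT_COMPLETION_PROMPT_TEMPLATES string defined above:

def render_prompt(template, query):
    # Naive placeholder substitution; replace {{query}} with the (preprocessed) user input.
    return template.replace("{{query}}", query)

def preprocess_query(query):
    # Hypothetical preprocessing hook: trim and collapse whitespace before prompting.
    return " ".join(query.split())

prompt = render_prompt(
    ENGLISH_REACT_COMPLETION_PROMPT_TEMPLATES,
    preprocess_query("   Normalize   this   user   question   "),
)
print(prompt)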

As for the structure of the codebase for the repository, I'm afraid I couldn't find that information in the repository. If you could provide more details about what you're looking for, I might be able to help you better.


About Dosu

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

Help me be more useful! Please leave a 👍 if this is helpful and 👎 if it is irrelevant.

If you want to continue the conversation, start your reply with @dosu-bot.

dosubot[bot] avatar Mar 08 '24 13:03 dosubot[bot]

When interacting with the Language Model (LLM), you may find it beneficial to modify or preprocess your input data to enhance the quality of responses or adapt it to specific use cases. This documentation provides guidance on how to modify and preprocess input before sending it to the LLM. Examples and code snippets are included to illustrate the process.

Modifying/Preprocessing Techniques:

  1. Cleaning Text: Remove unnecessary characters, symbols, or special characters from the input to ensure that the LLM focuses on meaningful content.
  2. Lowercasing: Convert all text to lowercase to maintain consistency and avoid the model treating the same word differently based on its case.

Example for Cleaning Text (Solution #1):

import re

def clean_text(input_text):
    # Remove special characters, symbols, and non-alphanumeric characters
    cleaned_text = re.sub(r'[^A-Za-z0-9 ]+', '', input_text)
    return cleaned_text

# Example usage
input_text = "Remove !@#$ special characters from this text."
cleaned_text = clean_text(input_text)

print(f"Original Text: {input_text}")
print(f"Cleaned Text: {cleaned_text}")

This example defines a clean_text function that uses a regular expression to remove any characters that are not letters or numbers. The function is then applied to an example input text, and the original and cleaned texts are printed.

Example for Lowercasing (Solution #2):

def lowercase_text(input_text):
    # Convert all text to lowercase
    lowercased_text = input_text.lower()
    return lowercased_text

# Example usage
input_text = "Convert This Text To Lowercase."
lowercased_text = lowercase_text(input_text)

print(f"Original Text: {input_text}")
print(f"Lowercased Text: {lowercased_text}")

This example defines a lowercase_text function that converts all characters in the input text to lowercase. The function is then applied to an example input text, and the original and lowercased texts are printed.
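Putting the two techniques together, the preprocessing can be applied just before the prompt is sent to the model. The following is a minimal sketch, assuming the openai Python client (v1 interface) and a combined preprocess helper defined here for illustration; adapt the model name and pipeline to your needs:

import re
from openai import OpenAI

def preprocess(input_text):
    # Chain the two steps: strip special characters, then lowercase.
    cleaned = re.sub(r'[^A-Za-z0-9 ]+', '', input_text)
    return cleaned.lower()

client = OpenAI()  # Reads OPENAI_API_KEY from the environment

user_input = "Remove !@#$ Special Characters From THIS Text."
prompt = preprocess(user_input)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)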

PandaAnshuman avatar Mar 10 '24 09:03 PandaAnshuman

Hi, @CodinjaoftheWorld! I'm helping the team manage their backlog and am marking this issue as stale.

It looks like you were seeking guidance on modifying or preprocessing input before sending it to the LLM (OpenAI) as a prompt. I provided a detailed response outlining steps to modify the input, and PandaAnshuman also contributed examples and code snippets for preprocessing techniques. The issue has been resolved with the provided guidance.

Could you please confirm if this issue is still relevant to the latest version of the repository? If it is, please let the team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your understanding and contribution to the repository! If you have any further questions or issues, feel free to reach out. Dosu

dosubot[bot] avatar Mar 29 '24 16:03 dosubot[bot]