Improve function call (#279) #393

Merged
merged 18 commits into from
Jan 14, 2024

Collaborator

@forever-ly forever-ly commented Nov 28, 2023

Description

  1. Function calling is currently implemented by parsing the docstring into a JSON schema, which ties the schema to a single docstring style and leads to inconsistencies. A promising improvement is to use pydantic, as the instructor library does (closes [Feature Request] Improve function calling implementation #279).
  2. The current implementation does not support Enum. For example, the get_weather_data tool may not work as expected, because input arguments such as time_units should be enum types (#334). Without enum support, the agent may hallucinate an invalid input (closes [Feature Request] Support Enum type in function calling #360).
  3. Support the new function call format (the functions parameter of client.chat.completions.create has been deprecated in favor of tools, with the model's calls returned as tool_calls, since openai version >= 1.2.3).
  4. Support user-defined JSON descriptions of functions/tools (e.g. read from a file).

Motivation and Context

The openai function/tool call feature requires a JSON schema description of each function/tool. The current implementation in camel uses camel.utils.commons.parse_doc to build that description by parsing Google-style docstrings. This has the following issues:

  1. It does not support tool calls. (The functions parameter of client.chat.completions.create has been deprecated in favor of tools, with the model's calls returned as tool_calls, since openai version >= 1.2.3.)
  2. It only supports Google-style docstrings, which is unsuitable when users prefer another docstring style or when functions come from third-party libraries. The current code (shown below) supports only Google-style parsing:
def parse_doc(func: Callable) -> Dict[str, Any]:
    ...
    if args_section:
        args_descs: List[Tuple[str, str, str, ]] = re.findall(
            r'(\w+)\s*\((\w+)\):\s*(.*)', args_section)
        properties = {
            name.strip(): {
                'type': type,
                'description': desc
            }
            for name, type, desc in args_descs
        }
        for name in properties:
            required.append(name)
    ...
  

The solution is to use docstring_parser, a library that parses several docstring styles, including ReST, Google, Numpydoc, and Epydoc.
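
The limitation of the regex-based approach can be reproduced in isolation. The following is a minimal sketch that reuses only the regular expression quoted from parse_doc above; the three sample docstring fragments are made up for illustration and are not taken from the camel code base.

```python
import re

# Pattern quoted from parse_doc: it expects "name (type): description".
ARG_PATTERN = re.compile(r'(\w+)\s*\((\w+)\):\s*(.*)')

# Hypothetical Google-style "Args" section.
google_style = """
    a (integer): the first number
    b (integer): the second number
"""

# Hypothetical Numpydoc-style "Parameters" section.
numpy_style = """
    a : int
        the first number
    b : int
        the second number
"""

# Google-style, but with a generic type that contains brackets.
generic_type = """
    values (List[int]): numbers to add
"""

google_args = ARG_PATTERN.findall(google_style)
numpy_args = ARG_PATTERN.findall(numpy_style)
generic_args = ARG_PATTERN.findall(generic_type)

print(google_args)   # two (name, type, description) tuples
print(numpy_args)    # nothing is extracted
print(generic_args)  # nothing is extracted, because (\w+) cannot match "List[int]"
```

Only the first fragment yields any parameters, which is why a parser that understands several docstring styles is a better fit for descriptions, and why types are better taken from somewhere other than the docstring.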

  3. Problems with type parsing. The current parse_doc obtains parameter types by parsing the docstring, so docstrings must use JSON schema data types instead of Python data types, and types such as Enum cannot be supported this way. For example, the docstring currently has to be written as:
    """Adds two numbers.  
    Args:        
    a (integer): the description of "a"    
    b (string): the description of "b"
    """    

If we write it in the following form, the parsed types will be int and str, which are inconsistent with the JSON schema types that the openai function call accepts:

    """Adds two numbers.  
    Args:        
    a (int): the description of "a"    
    b (str): the description of "b"
    """    

Therefore, following the corresponding implementations in the llama_index and instructor libraries, the inspect module is used to extract type information from the function signature instead of the docstring, and pydantic is then used to convert the Python types to JSON schema types. This implementation can also support more complex data types, such as Enum and datetime.
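
To make the idea concrete, here is a simplified, standard-library-only sketch of signature-based schema generation. It is not the code in this PR: the PR delegates the type conversion to pydantic, whereas this sketch uses a small hand-written mapping. The names annotation_to_schema, sketch_function_schema, Units, and get_weather are hypothetical, and enums are assumed to have string values.

```python
import inspect
from datetime import datetime
from enum import Enum
from typing import Any, Callable, Dict, List, get_args, get_origin, get_type_hints

# Hypothetical mapping from Python primitives to JSON schema type names.
PRIMITIVES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def annotation_to_schema(tp: Any) -> Dict[str, Any]:
    """Convert one type annotation to a JSON schema fragment."""
    if tp in PRIMITIVES:
        return {"type": PRIMITIVES[tp]}
    if tp is datetime:
        return {"type": "string", "format": "date-time"}
    if inspect.isclass(tp) and issubclass(tp, Enum):
        # Assumes string-valued enum members.
        return {"type": "string", "enum": [member.value for member in tp]}
    if get_origin(tp) is list:
        args = get_args(tp)
        items = annotation_to_schema(args[0]) if args else {}
        return {"type": "array", "items": items}
    return {}  # unknown annotation: accept any value


def sketch_function_schema(func: Callable) -> Dict[str, Any]:
    """Build a function schema from the signature, not the docstring."""
    hints = get_type_hints(func)
    properties: Dict[str, Any] = {}
    required: List[str] = []
    for name, param in inspect.signature(func).parameters.items():
        prop = annotation_to_schema(hints.get(name, Any))
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default, so the caller must supply it
        else:
            default = param.default
            prop["default"] = default.value if isinstance(default, Enum) else default
        properties[name] = prop
    return {
        "name": func.__name__,
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


class Units(Enum):  # hypothetical enum for the example
    METRIC = "metric"
    IMPERIAL = "imperial"


def get_weather(city: str, days: int, hours: List[int], when: datetime,
                units: Units = Units.METRIC) -> str:
    return ""


schema = sketch_function_schema(get_weather)
print(schema["parameters"]["required"])
```

Unlike pydantic, this sketch inlines the enum values instead of emitting a $defs entry with a $ref, and it does not read parameter descriptions from the docstring.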

In conclusion, I defined get_openai_function_schema and get_openai_tool_schema to replace the parse_doc function. They:

  • Support the new tool call format.
  • Automatically convert Python parameter types to JSON schema types.
  • Support data types such as enums, as well as default parameter values.

For example, for the following function:
def test_all_parameters(  
        str_para: str,  
        int_para: int,  
        list_para: List[int],  
        float_para: float,  
        datatime_para: datetime,  
        default_enum_para: RoleType = RoleType.CRITIC,  
          
):  
    """  
    A function to test all parameter type. The parameters will be provided by user.    
    Args: 
	    str_para (str):        
	    str_para: str_para desc        
	    int_para (int): int_para desc  
        list_para (List): list_para desc        
        float_para (float): float_para desc        
        datatime_para (datetime): datatime_para desc        
        default_enum_para (RoleType): default_enum_para desc    
    """

The result of parse_doc is:

{
'name': 'test_all_parameters',
'description': 'A function to test all parameter type. The parameters will be provided by user.',
'parameters': {
	'type': 'object',
	'properties': {
	'str_para': {'type': 'str','description': 'str_para: str_para desc  '},
	'int_para': {'type': 'int', 'description': 'int_para desc'},
	'list_para': {'type': 'List', 'description': 'list_para desc'},
	'float_para': {'type': 'float', 'description': 'float_para desc'},
	'datatime_para': {'type': 'datetime', 'description': 'datatime_para desc'},
	'default_enum_para': {'type': 'RoleType','description': 'default_enum_para desc'}
	},
'required': ['str_para','int_para','list_para','float_para','datatime_para','default_enum_para']}}

The result of get_openai_function_schema is:

{
'name': 'test_all_parameters',
'description': 'A function to test all parameter type. The parameters will be provided by user.',
'parameters': {
	'$defs': {'RoleType': {'enum': ['assistant','user','critic','embodiment','default'],'type': 'string'}},
	'properties': {
		'str_para': {'type': 'string'},
		'int_para': {'type': 'integer'},
		'list_para': {'items': {'type': 'integer'}, 'type': 'array'},
		'float_para': {'type': 'number'},
		'datatime_para': {'format': 'date-time', 'type': 'string'},
		'default_enum_para': {'allOf': [{'$ref': '#/$defs/RoleType'}],
		'default': 'critic'}
		},
	'required': ['str_para','int_para','list_para','float_para','datatime_para'],
	'type': 'object'
	}
}

The result of get_openai_tool_schema is:

{
'type': 'function',
 'function': {
	'name': 'test_all_parameters',
	'description': 'A function to test all parameter type. The parameters will be provided by user.',
	'parameters': {
		'$defs': {'RoleType': {'enum': ['assistant','user','critic','embodiment','default'],'type': 'string'}},
		'properties': {
			'str_para': {'type': 'string'},
			'int_para': {'type': 'integer'},
			'list_para': {'items': {'type': 'integer'}, 'type': 'array'},
			'float_para': {'type': 'number'},
			'datatime_para': {'format': 'date-time', 'type': 'string'},
			'default_enum_para': {'allOf': [{'$ref': '#/$defs/RoleType'}],
			'default': 'critic'}
			},
		'required': ['str_para','int_para','list_para','float_para','datatime_para'],
		'type': 'object'
		}
	}

 }
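
Comparing the last two results shows that the tool schema is the function schema wrapped in an object with 'type': 'function'. A minimal sketch of that relationship follows; to_tool_schema is a hypothetical helper written for illustration and is not a function from this PR.

```python
from typing import Any, Dict


def to_tool_schema(function_schema: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap a function schema in the envelope used by the tool call format."""
    return {"type": "function", "function": function_schema}


# Hypothetical function schema for a two-argument "add" function.
function_schema = {
    "name": "add",
    "description": "Adds two numbers.",
    "parameters": {
        "type": "object",
        "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
        "required": ["a", "b"],
    },
}

tool_schema = to_tool_schema(function_schema)
print(tool_schema["type"], tool_schema["function"]["name"])
```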

In addition, the function call module needs to:

  • Accept a user-defined function call JSON description (e.g. read from a file)
  • Modify parts of the JSON description
  • Verify that the JSON description is valid

Therefore, I have modified class OpenFunction to provide the above functionality:

  • validate_openai_tool_schema: validates the format of the tool schema against the JSON schema specification
  • get_openai_tool_schema;set_openai_tool_schema
  • get_openai_function_schema;set_openai_function_schema
  • get_function_name;set_function_name
  • get_function_description;set_function_description
  • get_paramter_description;set_paramter_description
  • get_parameter;set_parameter
  • parameters
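
As a rough illustration of the "user-defined description plus validation" workflow, the sketch below loads a tool schema from JSON text and runs a few structural checks. It is an assumption-laden stand-in, not the PR's validate_openai_tool_schema (which validates against the JSON schema specification): check_tool_schema is a hypothetical helper, and the JSON text stands in for the contents of a file that would normally be read with json.load.

```python
import json
from typing import Any, Dict


def check_tool_schema(schema: Dict[str, Any]) -> None:
    """Raise ValueError if the tool schema is structurally incomplete."""
    if schema.get("type") != "function":
        raise ValueError("tool schema must have 'type': 'function'")
    function = schema.get("function")
    if not isinstance(function, dict):
        raise ValueError("tool schema must contain a 'function' object")
    for key in ("name", "description", "parameters"):
        if key not in function:
            raise ValueError(f"function schema is missing '{key}'")
    parameters = function["parameters"]
    if parameters.get("type") != "object":
        raise ValueError("'parameters' must have 'type': 'object'")
    # Every required name must be declared under 'properties'.
    missing = set(parameters.get("required", [])) - set(parameters.get("properties", {}))
    if missing:
        raise ValueError(f"required parameters not declared: {sorted(missing)}")


# Stand-in for the contents of a user-provided JSON file.
user_defined = json.loads("""
{
  "type": "function",
  "function": {
    "name": "add",
    "description": "Adds two numbers.",
    "parameters": {
      "type": "object",
      "properties": {"a": {"type": "integer"}, "b": {"type": "integer"}},
      "required": ["a", "b"]
    }
  }
}
""")

check_tool_schema(user_defined)  # passes silently

# Modifying part of the description, then re-validating.
user_defined["function"]["description"] = "Adds two integers."
check_tool_schema(user_defined)
```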

Types of changes

What types of changes does your code introduce? Put an x in all the boxes that apply:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds core functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)
  • Example (update in the folder of example)

Checklist

Go over all the following points, and put an x in all the boxes that apply.
If you are unsure about any of these, don't hesitate to ask. We are here to help!

  • I have read the CONTRIBUTION guide. (required)
  • My change requires a change to the documentation.
  • I have updated the tests accordingly. (required for a bug fix or a new feature)
  • I have updated the documentation accordingly.

Collaborator

@dandansamax dandansamax left a comment

Awesome! It makes the feature much easier to use. Thanks a lot. I left some minor suggestions to improve the readability of the code.

Collaborator

@dandansamax dandansamax left a comment

Sorry for the late review. Your code looks awesome; let's wait for @lightaime to have a look. In the meantime, could you add an example showing how to add a custom function? I believe it will be useful.

@dandansamax
Collaborator

Tests failed because pydantic 2.x conflicts with argilla. We are solving it.

@Billy1900 Billy1900 left a comment

The test function is good and comprehensive.

"""
return a * b

expect_res = json.loads("""{"type": "function",

JSON could be formatted here.

Member

Good point

@lightaime
Member

Thanks @forever-ly for this very helpful PR! This is awesome. Thanks also to everyone for the reviews. I will go ahead and merge it. Since it is a large and helpful PR, please feel free to review it again and let me know if there are any other issues.

@lightaime lightaime merged commit 067b558 into master Jan 14, 2024
6 checks passed
@lightaime lightaime deleted the improve_function_call branch January 14, 2024 04:05
Member

@Wendong-Fan Wendong-Fan left a comment

High-quality code!

@Billy1900 Billy1900 left a comment

Clean code and structure.

@yiyiyi0817
Member

Hello, @forever-ly. I seem to have encountered a bug and I'm curious if get_openai_tool_schema can parse data of type Tuple[float, float]. I have a function defined as follows: def get_elevation(lat_lng: Tuple[float, float]) -> str:. However, this leads to an error: openai.BadRequestError: Error code: 400 - {'error': {'message': "Invalid schema for function 'get_elevation': In context=('properties', 'lat_lng'), array schema missing items", 'type': 'invalid_request_error', 'param': None, 'code': None}}. Interestingly, when I do not specify the float type, i.e., def get_elevation(lat_lng: Tuple) -> str:, the issue does not occur.

Then I looked at the output of get_openai_tool_schema for def get_elevation(lat_lng: Tuple[float, float]) -> str:

{
  "type": "function",
  "function": {
    "name": "get_elevation",
    "description": "Retrieves elevation data for a given latitude and longitude.\nUses the Google Maps API to fetch elevation data for the specified latitude\nand longitude. It handles exceptions gracefully and returns a description\nof the elevation, including its value in meters and the data resolution.",
    "parameters": {
      "properties": {
        "lat_lng": {
          "maxItems": 2,
          "minItems": 2,
          "prefixItems": [{ "type": "number" }, { "type": "number" }],
          "type": "array",
          "description": "The latitude and longitude for\nwhich to retrieve elevation data."
        }
      },
      "required": ["lat_lng"],
      "type": "object"
    }
  }
}

It has a key named "prefixItems" instead of "items".

But for def get_elevation(lat_lng: Tuple) -> str:, "items" exists, as follows:

{
  "type": "function",
  "function": {
    "name": "get_elevation",
    "description": "Retrieves elevation data for a given latitude and longitude.\nUses the Google Maps API to fetch elevation data for the specified latitude\nand longitude. It handles exceptions gracefully and returns a description\nof the elevation, including its value in meters and the data resolution.",
    "parameters": {
      "properties": {
        "lat_lng": {
          "items": {},
          "type": "array",
          "description": "The latitude and longitude for\nwhich to retrieve elevation data."
        }
      },
      "required": ["lat_lng"],
      "type": "object"
    }
  }
}

I suspect that this issue might be due to OpenAI not accepting the "prefixItems" key, which is produced by the get_openai_tool_schema function. I'm not entirely sure if my analysis is correct, and I would greatly appreciate your insights on this matter.
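
If that analysis is right, one possible workaround (not something proposed or confirmed in this thread) is to post-process the generated schema so that a fixed-length tuple whose elements all share one type is described with "items" instead of "prefixItems". The sketch below assumes exactly that; collapse_prefix_items is a hypothetical helper, and whether OpenAI accepts the resulting schema has not been verified here.

```python
import copy
from typing import Any, Dict


def collapse_prefix_items(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy where homogeneous 'prefixItems' arrays use 'items'."""
    result = copy.deepcopy(schema)

    def visit(node: Any) -> None:
        if isinstance(node, dict):
            prefix = node.get("prefixItems")
            if (node.get("type") == "array" and isinstance(prefix, list)
                    and prefix and all(entry == prefix[0] for entry in prefix)):
                # All positions share one schema, so 'items' expresses the same
                # constraint together with the existing minItems/maxItems.
                node["items"] = prefix[0]
                del node["prefixItems"]
            for value in node.values():
                visit(value)
        elif isinstance(node, list):
            for value in node:
                visit(value)

    visit(result)
    return result


# The 'lat_lng' property from the schema quoted above (description omitted).
lat_lng = {
    "maxItems": 2,
    "minItems": 2,
    "prefixItems": [{"type": "number"}, {"type": "number"}],
    "type": "array",
}

fixed = collapse_prefix_items({"properties": {"lat_lng": lat_lng}})
print(fixed["properties"]["lat_lng"])
```

Tuples with mixed element types (for example Tuple[str, int]) are left untouched by this sketch, so they would still need a different treatment.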
