functional

Functional interface: the core functions used to build the components. Users can also use these functions to customize their own components.

Functions

check_data_class_field_args_one(cls)

Check if the field is a dataclass.

check_data_class_field_args_zero(cls)

Check if the field is a dataclass.

check_if_class_field_args_one_exists(cls)

Check if the field is a dataclass.

check_if_class_field_args_zero_exists(cls)

Check if the field is a dataclass.

compose_model_kwargs(default_model_kwargs, ...)

Add new arguments or overwrite the default arguments with the new arguments.

convert_schema_to_signature(schema)

Convert the value from get_dataclass_schema to a string description.

custom_asdict(obj, *[, dict_factory, exclude])

Equivalent to asdict() from dataclasses module but with exclude fields.

dataclass_obj_from_dict(cls, data)

Convert a dictionary to a dataclass object.

evaluate_ast_node(node[, context_map])

Recursively evaluates an AST node and returns the corresponding Python object.

extract_dataclass_type(type_hint)

Extract the actual dataclass type from a type hint that could be Optional or other generic.

extract_first_boolean(text)

Extract the first boolean from the provided text.

extract_first_float(text)

Extract the first float from the provided text.

extract_first_int(text)

Extract the first integer from the provided text.

extract_function_expression(text[, ...])

Extract function expression from text.

extract_json_str(text[, add_missing_right_brace])

Extract JSON string from text.

extract_list_str(text[, ...])

Extract the first complete list string from the provided text.

extract_yaml_str(text)

Extract YAML string from text.

fix_json_escaped_single_quotes(json_str)

fix_json_missing_commas(json_str)

from_dict_to_json(data[, sort_keys])

Convert a dictionary to a JSON string.

from_dict_to_yaml(data[, sort_keys])

Convert a dictionary to a YAML string.

from_json_to_dict(json_str)

Convert a JSON string to a dictionary.

from_yaml_to_dict(yaml_str)

Convert a YAML string to a dictionary.

generate_function_call_expression_from_callable(...)

Generate a function call expression string from a callable function and its arguments.

generate_readable_key_for_function(fn)

get_dataclass_schema(cls[, exclude, ...])

Generate a schema dictionary for a dataclass including nested structures.

get_enum_schema(enum_cls)

get_fun_schema(name, func)

Get the schema of a function.

get_top_k_indices_scores(scores, top_k)

get_type_schema(type_obj[, exclude, ...])

Retrieve the type name, handling complex and nested types.

is_dataclass_instance(obj)

is_normalized(v[, tol])

is_potential_dataclass(t)

Check if the type is directly a dataclass or potentially a wrapped dataclass like Optional.

normalize_np_array(v)

normalize_vector(v)

parse_function_call_expr(function_expr[, ...])

Parse a string representing a function call into its components and ensure safe execution by only allowing function calls from a predefined context map.

parse_json_str_to_obj(json_str)

Parse a variety of JSON-format strings to a Python object.

parse_yaml_str_to_obj(yaml_str)

Parse a YAML string to a Python object.

random_sample(dataset, num_shots[, replace, ...])

Randomly sample num_shots from the dataset.

represent_ordereddict(dumper, data)

sandbox_exec(code[, context, timeout])

Execute code in a sandboxed environment with a timeout.

validate_data(data, fieldtypes)

custom_asdict(obj, *, dict_factory=dict, exclude: Dict[str, List[str]] | None = None) -> Dict[str, Any]

Equivalent to asdict() from dataclasses module but with exclude fields.

Return the fields of a dataclass instance as a new dictionary mapping field names to field values, while allowing certain fields to be excluded.

If given, ‘dict_factory’ will be used instead of built-in dict. The function applies recursively to field values that are dataclass instances. This will also look into built-in containers: tuples, lists, and dicts.
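The recursion described above can be sketched with the standard library alone. This is a minimal illustration, not the library's implementation: the helper name asdict_with_exclude is hypothetical, the dict_factory argument is omitted, and keying exclude by the dataclass's class name is an assumption based on its Dict[str, List[str]] type.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Dict, List, Optional


def asdict_with_exclude(obj: Any, exclude: Optional[Dict[str, List[str]]] = None) -> Any:
    """Recursively convert dataclass instances to dicts, skipping excluded fields."""
    exclude = exclude or {}
    if is_dataclass(obj) and not isinstance(obj, type):
        # Assumption: `exclude` maps a class name to the field names to drop.
        skipped = set(exclude.get(type(obj).__name__, []))
        return {
            f.name: asdict_with_exclude(getattr(obj, f.name), exclude)
            for f in fields(obj)
            if f.name not in skipped
        }
    # Look into built-in containers: lists, tuples, and dicts.
    if isinstance(obj, list):
        return [asdict_with_exclude(v, exclude) for v in obj]
    if isinstance(obj, tuple):
        return tuple(asdict_with_exclude(v, exclude) for v in obj)
    if isinstance(obj, dict):
        return {k: asdict_with_exclude(v, exclude) for k, v in obj.items()}
    return obj


@dataclass
class Address:
    city: str
    zip_code: str


@dataclass
class Person:
    name: str
    password: str
    addresses: List[Address]


person = Person("Ada", "secret", [Address("Paris", "75001")])
result = asdict_with_exclude(
    person, exclude={"Person": ["password"], "Address": ["zip_code"]}
)
print(result)
```

The excluded fields are dropped at every nesting level, including inside the list of nested dataclass instances.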

validate_data(data: Dict[str, Any], fieldtypes: Dict[str, Any]) -> bool

is_potential_dataclass(t)

Check if the type is directly a dataclass or potentially a wrapped dataclass like Optional.

extract_dataclass_type(type_hint)

Extract the actual dataclass type from a type hint that could be Optional or other generic.
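One way to unwrap a dataclass from a hint such as Optional[...] or List[...] is to walk the hint's type arguments with typing.get_args. The sketch below is an assumption about the general technique, not the library's code; find_dataclass_type is a hypothetical name.

```python
from dataclasses import dataclass, is_dataclass
from typing import Any, List, Optional, get_args


@dataclass
class Point:
    x: int
    y: int


def find_dataclass_type(type_hint: Any) -> Optional[type]:
    """Return the first dataclass found in a type hint, or None if there is none."""
    # A plain dataclass type is returned directly.
    if isinstance(type_hint, type) and is_dataclass(type_hint):
        return type_hint
    # Otherwise search the generic arguments, e.g. (Point, NoneType) for Optional[Point].
    for arg in get_args(type_hint):
        found = find_dataclass_type(arg)
        if found is not None:
            return found
    return None


print(find_dataclass_type(Optional[Point]))
print(find_dataclass_type(int))
```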

check_data_class_field_args_zero(cls)

Check if the field is a dataclass.

check_if_class_field_args_zero_exists(cls)

Check if the field is a dataclass.

check_data_class_field_args_one(cls)

Check if the field is a dataclass.

check_if_class_field_args_one_exists(cls)

Check if the field is a dataclass.

dataclass_obj_from_dict(cls: Type[object], data: Dict[str, object]) -> Any

Convert a dictionary to a dataclass object.

Supports nested dataclasses, lists, and dictionaries.

Note

If any required field is missing, it will raise an error. Do not use the dict that has excluded required fields.

Example:

from dataclasses import dataclass
from typing import List

@dataclass
class TrecData:
    question: str
    label: int

@dataclass
class TrecDataList:

    data: List[TrecData]
    name: str

trec_data_dict = {"data": [{"question": "What is the capital of France?", "label": 0}], "name": "trec_data_list"}

dataclass_obj_from_dict(TrecDataList, trec_data_dict)

# Output:
# TrecDataList(data=[TrecData(question='What is the capital of France?', label=0)], name='trec_data_list')

represent_ordereddict(dumper, data)

from_dict_to_json(data: Dict[str, Any], sort_keys: bool = False) -> str

Convert a dictionary to a JSON string.

from_dict_to_yaml(data: Dict[str, Any], sort_keys: bool = False) -> str

Convert a dictionary to a YAML string.

from_json_to_dict(json_str: str) -> Dict[str, Any]

Convert a JSON string to a dictionary.

from_yaml_to_dict(yaml_str: str) -> Dict[str, Any]

Convert a YAML string to a dictionary.

is_dataclass_instance(obj)

get_type_schema(type_obj, exclude: Dict[str, List[str]] | None = None, type_var_map: Dict | None = None) -> str

Retrieve the type name, handling complex and nested types.

get_enum_schema(enum_cls: Type[Enum]) -> Dict[str, object]

get_dataclass_schema(cls, exclude: Dict[str, List[str]] | None = None, type_var_map: Dict | None = None) -> Dict[str, Dict[str, object]]

Generate a schema dictionary for a dataclass including nested structures.

  1. Support customized dataclass with required_field function.

  2. Support nested dataclasses, even with generics like List, Dict, etc.

  3. Support metadata in the dataclass fields.
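To make the three points above concrete, here is a rough standard-library sketch of building a schema from dataclass fields. The helper names and the output layout ("type", "properties", "required", "args") are illustrative assumptions; the actual keys produced by get_dataclass_schema may differ, and the exclude and type_var_map arguments are not modeled.

```python
from dataclasses import MISSING, dataclass, field, fields, is_dataclass
from typing import Any, Dict, List, get_args, get_origin


def sketch_type_schema(tp: Any) -> Dict[str, Any]:
    """Describe a type, recursing into nested dataclasses and generic arguments."""
    if isinstance(tp, type) and is_dataclass(tp):
        return sketch_dataclass_schema(tp)
    origin = get_origin(tp)
    if origin is None:
        return {"type": getattr(tp, "__name__", str(tp))}
    # Generic such as List[Item]: record the container and describe its arguments.
    schema: Dict[str, Any] = {"type": getattr(origin, "__name__", str(origin))}
    args = [sketch_type_schema(a) for a in get_args(tp)]
    if args:
        schema["args"] = args
    return schema


def sketch_dataclass_schema(cls: type) -> Dict[str, Any]:
    """Build a schema dict listing each field's type, metadata, and required-ness."""
    properties: Dict[str, Any] = {}
    required: List[str] = []
    for f in fields(cls):
        prop = sketch_type_schema(f.type)
        prop.update(f.metadata)  # carry field metadata such as a description
        properties[f.name] = prop
        # A field with neither a default nor a default_factory is required.
        if f.default is MISSING and f.default_factory is MISSING:
            required.append(f.name)
    return {"type": cls.__name__, "properties": properties, "required": required}


@dataclass
class Item:
    name: str = field(metadata={"desc": "Item name"})
    price: float = 0.0


@dataclass
class Order:
    items: List[Item]
    note: str = ""


schema = sketch_dataclass_schema(Order)
print(schema)
```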

convert_schema_to_signature(schema: Dict[str, Dict[str, Any]]) -> Dict[str, str]

Convert the value from get_dataclass_schema to a string description.

get_fun_schema(name: str, func: Callable[..., object]) -> Dict[str, object]

Get the schema of a function. Supports dataclass and Union types as well as common data types such as int, str, float, list, dict, and set.

Example:

import json

def example_function(x: int, y: str = "default") -> int:
    return x

schema = get_fun_schema("example_function", example_function)
print(json.dumps(schema, indent=4))

# Output:
# {
#     "type": "object",
#     "properties": {
#         "x": {
#             "type": "int"
#         },
#         "y": {
#             "type": "str",
#             "default": "default"
#         }
#     },
#     "required": [
#         "x"
#     ]
# }
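A schema of the shape shown in the example can be derived from a function's signature with the inspect module. The following is a minimal sketch under that assumption; sketch_fun_schema is a hypothetical stand-in that handles only simple annotations, not the dataclass and Union support described above.

```python
import inspect
import json
from typing import Any, Callable, Dict, List


def sketch_fun_schema(func: Callable[..., Any]) -> Dict[str, Any]:
    """Build a simple schema from a function's parameters and defaults."""
    properties: Dict[str, Any] = {}
    required: List[str] = []
    for name, param in inspect.signature(func).parameters.items():
        annotation = param.annotation
        if annotation is inspect.Parameter.empty:
            type_name = "Any"  # unannotated parameter
        else:
            type_name = getattr(annotation, "__name__", str(annotation))
        prop: Dict[str, Any] = {"type": type_name}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the argument is required
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


def example_function(x: int, y: str = "default") -> int:
    return x


sketch_schema = sketch_fun_schema(example_function)
print(json.dumps(sketch_schema, indent=4))
```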

evaluate_ast_node(node: AST, context_map: Dict[str, Any] = None)

Recursively evaluates an AST node and returns the corresponding Python object.

Parameters:
  • node (ast.AST) – The AST node to evaluate. This node can represent various parts of Python expressions, such as literals, identifiers, lists, dictionaries, and function calls.

  • context_map (Dict[str, Any]) – A dictionary that maps variable names to their respective values and functions. This context is used to resolve names and execute functions.

Returns:

The result of evaluating the node. The type of the returned object depends on the nature of the node:
  • Constants return their literal value.

  • Names are looked up in the context_map.

  • Lists and tuples return their contained values as a list or tuple.

  • Dictionaries return a dictionary with keys and values evaluated.

  • Function calls invoke the function with evaluated arguments and return its result.

Return type:

Any

Raises:

ValueError – If the node type is unsupported, a ValueError is raised indicating the inability to evaluate the node.
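The evaluation rules listed above map naturally onto a small recursive function over Python's ast node classes. This is a simplified sketch rather than the library's implementation; sketch_evaluate is a hypothetical name, and raising ValueError for an unknown name is an assumption.

```python
import ast
from typing import Any, Dict, Optional


def sketch_evaluate(node: ast.AST, context_map: Optional[Dict[str, Any]] = None) -> Any:
    """Evaluate a restricted subset of expression nodes against a context map."""
    context_map = context_map or {}
    if isinstance(node, ast.Expression):
        return sketch_evaluate(node.body, context_map)
    if isinstance(node, ast.Constant):
        return node.value  # literals return their value
    if isinstance(node, ast.Name):
        if node.id not in context_map:
            raise ValueError(f"Unknown name: {node.id}")
        return context_map[node.id]  # names are resolved only through the context
    if isinstance(node, ast.List):
        return [sketch_evaluate(e, context_map) for e in node.elts]
    if isinstance(node, ast.Tuple):
        return tuple(sketch_evaluate(e, context_map) for e in node.elts)
    if isinstance(node, ast.Dict):
        return {
            sketch_evaluate(k, context_map): sketch_evaluate(v, context_map)
            for k, v in zip(node.keys, node.values)
        }
    if isinstance(node, ast.Call):
        # Simplification: *args and **kwargs unpacking are not handled here.
        func = sketch_evaluate(node.func, context_map)
        args = [sketch_evaluate(a, context_map) for a in node.args]
        kwargs = {kw.arg: sketch_evaluate(kw.value, context_map) for kw in node.keywords}
        return func(*args, **kwargs)
    raise ValueError(f"Unsupported AST node type: {type(node).__name__}")


context = {"add": lambda a, b: a + b, "scale": lambda x, factor: x * factor, "x": 5}
tree = ast.parse("add(1, scale(2, factor=3))", mode="eval")
print(sketch_evaluate(tree, context))
```

Because names are looked up only in the context map, an expression cannot reach builtins or modules that the caller did not supply.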

parse_function_call_expr(function_expr: str, context_map: Dict[str, Any] = None) -> Tuple[str, List[Any], Dict[str, Any]]

Parse a string representing a function call into its components and ensure safe execution by only allowing function calls from a predefined context map.

Parameters:
  • function_expr (str) – The string representing the function.

  • context_map (Dict[str, Any]) – A dictionary that maps variable names to their respective values and functions. This context is used to resolve names and execute functions.
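Splitting a call string into a name, positional arguments, and keyword arguments can be sketched with ast.parse. The version below is a simplified assumption: sketch_parse_call is a hypothetical name, arguments are restricted to literals via ast.literal_eval, and the context map is used only to check that the function name is allowed.

```python
import ast
from typing import Any, Dict, List, Optional, Tuple


def sketch_parse_call(
    function_expr: str, context_map: Optional[Dict[str, Any]] = None
) -> Tuple[str, List[Any], Dict[str, Any]]:
    """Split 'name(arg, key=value)' into (name, args, kwargs)."""
    call = ast.parse(function_expr.strip(), mode="eval").body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError("Expected a plain function call expression.")
    name = call.func.id
    # Only functions registered in the context map are accepted.
    if context_map is not None and name not in context_map:
        raise ValueError(f"Function {name!r} is not in the context map.")
    # literal_eval accepts AST nodes and only evaluates Python literals.
    args = [ast.literal_eval(a) for a in call.args]
    kwargs = {kw.arg: ast.literal_eval(kw.value) for kw in call.keywords}
    return name, args, kwargs


parsed = sketch_parse_call("add(1, 2, scale=[3, 4])", context_map={"add": sum})
print(parsed)
```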

generate_function_call_expression_from_callable(func: Callable[..., Any], *args: Any, **kwargs: Any) -> str

Generate a function call expression string from a callable function and its arguments.

Parameters:
  • func (Callable[..., Any]) – The callable function.

  • *args (Any) – Positional arguments to be passed to the function.

  • **kwargs (Any) – Keyword arguments to be passed to the function.

Returns:

The function call expression string.

Return type:

str
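As a rough illustration of producing such a string, the sketch below joins the function's name with the repr of each argument. The name sketch_call_expression is hypothetical, and using repr for argument formatting is an assumption about the output format.

```python
from typing import Any, Callable


def sketch_call_expression(func: Callable[..., Any], *args: Any, **kwargs: Any) -> str:
    """Render a call such as add(1, b='two') from a callable and its arguments."""
    parts = [repr(a) for a in args]
    parts += [f"{key}={value!r}" for key, value in kwargs.items()]
    return f"{func.__name__}({', '.join(parts)})"


def add(a: int, b: Any = 0) -> Any:
    return a


expression = sketch_call_expression(add, 1, b="two")
print(expression)
```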

sandbox_exec(code: str, context: Dict[str, object] | None = None, timeout: int = 5) -> Dict

Execute code in a sandboxed environment with a timeout.

  1. Works similarly to eval(), but with a timeout and a context like the one used by parse_function_call_expr.

  2. Is more flexible than a plain function call, because the code can define additional functions.

Parameters:
  • code (str) – The code to execute. It must assign its result to a variable, in the form output=… or similar, so that the result can be captured.

  • context (Dict[str, Any]) – The context to use for the execution.

  • timeout (int) – The execution timeout in seconds.
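The idea of running code with a limited context and a timeout can be sketched with exec and a worker thread. Everything here is an illustrative assumption: the name sketch_sandbox_exec, the "output" and "error" keys of the returned dict, and the small builtin allow-list. Restricting builtins this way is not a real security boundary, and a thread that times out is abandoned rather than killed.

```python
import threading
from typing import Any, Dict, Optional


def sketch_sandbox_exec(
    code: str, context: Optional[Dict[str, Any]] = None, timeout: float = 5
) -> Dict[str, Any]:
    """Run code with a restricted namespace and capture the `output` variable."""
    # Expose a few harmless builtins plus whatever the caller passes in.
    namespace: Dict[str, Any] = {
        "__builtins__": {"len": len, "range": range, "sum": sum, "min": min, "max": max}
    }
    namespace.update(context or {})
    result: Dict[str, Any] = {"output": None, "error": None}

    def run() -> None:
        try:
            exec(code, namespace)
            result["output"] = namespace.get("output")
        except Exception as exc:  # report any failure instead of raising
            result["error"] = f"{type(exc).__name__}: {exc}"

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    worker.join(timeout)
    if worker.is_alive():
        result["error"] = "Execution timed out"
    return result


snippet = "def double(x):\n    return x * 2\noutput = double(add(2, 3))"
sandbox_result = sketch_sandbox_exec(snippet, context={"add": lambda a, b: a + b})
print(sandbox_result)
```

The snippet defines its own helper function and assigns to output, which is what lets the caller read the result back from the namespace.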

compose_model_kwargs(default_model_kwargs: Dict, model_kwargs: Dict) -> Dict

Add new arguments or overwrite the default arguments with the new arguments.

Example:

default_model_kwargs = {"model": "gpt-3.5"}
model_kwargs = {"temperature": 0.5, "model": "gpt-3.5-turbo"}
compose_model_kwargs(default_model_kwargs, model_kwargs)  # => {"temperature": 0.5, "model": "gpt-3.5-turbo"}

is_normalized(v: List[float] | ndarray, tol=0.0001) -> bool

normalize_np_array(v: ndarray) -> ndarray

normalize_vector(v: List[float] | ndarray) -> List[float]

get_top_k_indices_scores(scores: List[float] | ndarray, top_k: int) -> Tuple[List[int], List[float]]

generate_readable_key_for_function(fn: Callable) -> str

extract_first_int(text: str) -> int

Extract the first integer from the provided text.

Parameters:

text (str) – The text containing potential integer data.

Returns:

The extracted integer.

Return type:

int

Raises:

ValueError – If no integer is found in the text.

extract_first_float(text: str) -> float

Extract the first float from the provided text.

Parameters:

text (str) – The text containing potential float data.

Returns:

The extracted float.

Return type:

float

Raises:

ValueError – If no float is found in the text.

extract_first_boolean(text: str) -> bool

Extract the first boolean from the provided text.

Parameters:

text (str) – The text containing potential boolean data.

Returns:

The extracted boolean.

Return type:

bool

Raises:

ValueError – If no boolean is found in the text.
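The three extractors above (integer, float, boolean) can each be approximated with a regular expression search. The patterns and function names below are assumptions for illustration; the library's actual matching rules, for example for signs or for text like "yes", are not specified here.

```python
import re


def sketch_first_int(text: str) -> int:
    """Return the first run of digits (with optional minus sign) as an int."""
    match = re.search(r"-?\d+", text)
    if match is None:
        raise ValueError("No integer found in the text.")
    return int(match.group())


def sketch_first_float(text: str) -> float:
    """Return the first number, with optional decimal part, as a float."""
    match = re.search(r"-?\d+(?:\.\d+)?", text)
    if match is None:
        raise ValueError("No float found in the text.")
    return float(match.group())


def sketch_first_bool(text: str) -> bool:
    """Return the first standalone 'true' or 'false', ignoring case."""
    match = re.search(r"\b(true|false)\b", text, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("No boolean found in the text.")
    return match.group(1).lower() == "true"


print(sketch_first_int("There are 12 apples and 3 pears"))
print(sketch_first_float("score: 0.75 out of 1"))
print(sketch_first_bool("The answer is True."))
```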

extract_function_expression(text: str, add_missing_right_parenthesis: bool = True) -> str

Extract function expression from text.

Extracts the first function expression found in the text by searching for '('. If the right parenthesis is not found and add_missing_right_parenthesis is True, one is added to the end of the string.

Parameters:
  • text (str) – The text containing the potential function expression.

  • add_missing_right_parenthesis (bool) – Whether to add a missing right parenthesis if it is missing.

Returns:

The extracted function expression.

Return type:

str

Raises:

ValueError – If no function expression is found or if the function extraction is incomplete without the option to add a missing parenthesis.

extract_json_str(text: str, add_missing_right_brace: bool = True) -> str

Extract JSON string from text.

Extracts the first JSON object or array found in the text by searching for { or [. If the right brace is not found and add_missing_right_brace is True, one is added to the end of the string.

Parameters:
  • text (str) – The text containing potential JSON data.

  • add_missing_right_brace (bool) – Whether to add a missing right brace if it is missing.

Returns:

The extracted JSON string.

Return type:

str

Raises:

ValueError – If no JSON object or array is found or if the JSON extraction is incomplete without the option to add a missing brace.
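The search-and-repair behavior described for extract_json_str can be sketched with a bracket stack. This is an illustration under assumptions, not the library's code: sketch_extract_json is a hypothetical name, and unlike the description it appends every missing closer, not just one.

```python
import json


def sketch_extract_json(text: str, add_missing_right_brace: bool = True) -> str:
    """Return the first JSON object or array in text, closing it if truncated."""
    starts = [i for i in (text.find("{"), text.find("[")) if i != -1]
    if not starts:
        raise ValueError("No JSON object or array found in the text.")
    start = min(starts)
    closers = {"{": "}", "[": "]"}
    stack = []  # closing characters still owed, innermost last
    in_string = False
    escaped = False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            # Brackets inside string values must not affect the nesting depth.
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in closers:
            stack.append(closers[ch])
        elif stack and ch == stack[-1]:
            stack.pop()
            if not stack:
                return text[start : i + 1]
    if not add_missing_right_brace:
        raise ValueError("Incomplete JSON: missing closing brace or bracket.")
    return text[start:] + "".join(reversed(stack))


complete = sketch_extract_json('Answer: {"a": [1, 2], "b": "x}y"} thanks')
repaired = sketch_extract_json('{"a": [1, 2')
print(json.loads(complete), json.loads(repaired))
```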

extract_list_str(text: str, add_missing_right_bracket: bool = True) -> str

Extract the first complete list string from the provided text.

If the list string is incomplete (missing the closing bracket), an option allows adding a closing bracket at the end.

Parameters:
  • text (str) – The text containing potential list data.

  • add_missing_right_bracket (bool) – Whether to add a closing bracket if it is missing.

Returns:

The extracted list string.

Return type:

str

Raises:

ValueError – If no list is found or if the list extraction is incomplete without the option to add a missing bracket.

extract_yaml_str(text: str) -> str

Extract YAML string from text.

Note

Unlike JSON, which can be extracted by locating {} or [], a YAML string has no enclosing delimiters, so it is crucial that the text uses a marker such as `yaml` or `yml` to indicate the start of the YAML string.

Parameters:

text (str) – The text containing potential YAML data.

Returns:

The extracted YAML string.

Return type:

str

Raises:

ValueError – If no YAML string is found in the text.

fix_json_missing_commas(json_str: str) -> str

fix_json_escaped_single_quotes(json_str: str) -> str

parse_yaml_str_to_obj(yaml_str: str) -> Dict[str, Any]

Parse a YAML string to a Python object.

yaml_str: has to be a valid YAML string.

parse_json_str_to_obj(json_str: str) -> Dict[str, Any] | List[Any]

Parse a variety of JSON-format strings to a Python object.

json_str: has to be a valid JSON string. Either {} or [].

random_sample(dataset: Sequence[T_co], num_shots: int, replace: bool | None = False, weights: List[float] | None = None, delta: float = 1e-05) -> List[T_co]

Randomly sample num_shots items from the dataset. If replace is True, sample with replacement.
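The difference between sampling with and without replacement, with optional weights, can be sketched using the standard random module. This is a hedged illustration: sketch_random_sample is a hypothetical name, the delta parameter from the signature is not modeled because its role is not described, and the weighted draw without replacement is one simple approach among several.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sketch_random_sample(
    dataset: Sequence[T],
    num_shots: int,
    replace: bool = False,
    weights: Optional[List[float]] = None,
) -> List[T]:
    """Sample num_shots items, optionally weighted, with or without replacement."""
    if replace:
        # With replacement: the same item may be drawn more than once.
        return random.choices(dataset, weights=weights, k=num_shots)
    if num_shots > len(dataset):
        raise ValueError("Cannot draw more items than the dataset holds without replacement.")
    if weights is None:
        return random.sample(list(dataset), num_shots)
    # Weighted, without replacement: draw one index at a time and remove it.
    indices = list(range(len(dataset)))
    remaining = list(weights)
    picked: List[T] = []
    for _ in range(num_shots):
        pos = random.choices(range(len(indices)), weights=remaining, k=1)[0]
        picked.append(dataset[indices.pop(pos)])
        remaining.pop(pos)
    return picked


random.seed(0)
data = list(range(10))
unique_shots = sketch_random_sample(data, 3)
repeated_shots = sketch_random_sample(["a", "b", "c"], 20, replace=True)
weighted_shots = sketch_random_sample(["a", "b", "c", "d"], 4, weights=[1.0, 2.0, 3.0, 4.0])
print(unique_shots, weighted_shots)
```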