Datasets#

Overview#


class BigBenchHard(task_name: Literal['BBH_object_counting'] = 'BBH_object_counting', root: str = None, split: Literal['train', 'val', 'test'] = 'train', *args, **kwargs)[source]#

Bases: Dataset

Big Bench Hard dataset for object counting task.

Task names are listed in the suzgunmirac/BIG-Bench-Hard repository.

Data will be saved to ~/.adalflow/cache_datasets/BBH_object_counting/{split}.csv if root is not specified.

Size of each split:
  • train: 50 examples

  • val: 50 examples

  • test: 100 examples

Parameters:
  • task_name (str) – The name of the task. “BBH_{task_name}” is the task name in the dataset.

  • root (str, optional) – Root directory of the dataset to save the data. Defaults to ~/.adalflow/cache_datasets/task_name.

  • split (str, optional) – The dataset split, supports "train" (default), "val" and "test".
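The default cache layout described above can be sketched with the standard library alone. This is only an illustration of the documented path pattern; `default_split_path` is a hypothetical helper, not part of the library, and it covers only the case where root is not specified.

```python
from pathlib import Path


def default_split_path(split: str = "train",
                       task_name: str = "BBH_object_counting") -> Path:
    """Hypothetical helper: build the documented default CSV path for a split.

    Mirrors ~/.adalflow/cache_datasets/BBH_object_counting/{split}.csv.
    It does not download or read any data.
    """
    if split not in ("train", "val", "test"):
        raise ValueError(f"unsupported split: {split!r}")
    cache_dir = Path.home() / ".adalflow" / "cache_datasets"
    return cache_dir / task_name / f"{split}.csv"


# Print where each split would be cached by default.
for name in ("train", "val", "test"):
    print(default_split_path(name))
```

How the layout changes when root is given is not stated here, so the sketch deliberately leaves that case out.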

static get_default_task_instruction()[source]#
class HotPotQA(only_hard_examples=True, root: str = None, split: Literal['train', 'val', 'test'] = 'train', keep_details: Literal['all', 'dev_titles', 'none'] = 'dev_titles', size: int = None, **kwargs)[source]#

Bases: Dataset

class Example(id: str = 'd8d60f99-32a8-4c9e-bc6e-19ae5921eedf', question: str = None, answer: str = None)[source]#

Bases: DataClass

A common dataclass for representing examples in a dataset.

id: str = 'd8d60f99-32a8-4c9e-bc6e-19ae5921eedf'#
question: str = None#
answer: str = None#
class HotPotQAData(id: str = 'd8d60f99-32a8-4c9e-bc6e-19ae5921eedf', question: str = None, answer: str = None, gold_titles: set = None)[source]#

Bases: Example

A dataclass for representing examples in the HotPotQA dataset.

gold_titles: set = None#
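
The field layout of Example and HotPotQAData can be mirrored with plain standard-library dataclasses, as in the sketch below. It does not use the library's DataClass base; the class names `ExampleSketch` and `HotPotQADataSketch` are hypothetical. It also assumes that the literal UUID shown in the signatures is a value captured when the documentation was built and that a fresh id is generated per instance, which this page does not confirm.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class ExampleSketch:
    """Stand-in for Example: an id plus a question/answer pair."""
    # Assumption: each instance gets its own UUID string.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    question: Optional[str] = None
    answer: Optional[str] = None


@dataclass
class HotPotQADataSketch(ExampleSketch):
    """Stand-in for HotPotQAData: adds the set of gold article titles."""
    gold_titles: Optional[Set[str]] = None


# Illustrative values only; they are not taken from the dataset.
item = HotPotQADataSketch(
    question="Which of the two films was released first?",
    answer="Film A",
    gold_titles={"Film A", "Film B"},
)
print(item.question, sorted(item.gold_titles))
```
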
class TrecDataset(root: str = None, split: Literal['train', 'test'] = 'train')[source]#

Bases: Dataset

Trec dataset for question classification.

Here we only load a small subset of the dataset for training and evaluation.

By default, the subset contains 600 training examples (100 per class), 36 validation examples, and 144 test examples. All splits are class-balanced.

References:
  • Dataset: https://huggingface.co/datasets/trec

  • Labels: https://huggingface.co/datasets/trec/blob/main/trec.py
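
To make "class-balanced" concrete, here is one way to draw a fixed number of examples per class from labeled rows. It is a sketch under stated assumptions, not the sampling code the library uses: `sample_balanced` is a hypothetical function, the rows are synthetic, and the six classes are inferred from 600 training examples at 100 per class.

```python
import random
from collections import Counter, defaultdict
from typing import List, Tuple

Row = Tuple[str, int]  # (question text, class index)


def sample_balanced(rows: List[Row], per_class: int, seed: int = 0) -> List[Row]:
    """Hypothetical helper: pick exactly `per_class` rows for every label."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)

    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    picked: List[Row] = []
    for label in sorted(by_label):
        group = by_label[label]
        if len(group) < per_class:
            raise ValueError(f"class {label} has only {len(group)} rows")
        picked.extend(rng.sample(group, per_class))
    rng.shuffle(picked)  # avoid emitting the subset grouped by class
    return picked


# Synthetic stand-in data: 1200 rows spread evenly over 6 classes.
rows = [(f"question {i}", i % 6) for i in range(1200)]
subset = sample_balanced(rows, per_class=100)
counts = Counter(label for _, label in subset)
print(len(subset), dict(sorted(counts.items())))
```
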

class TrecData(id: str = '2c666562-67c1-496e-801d-a72b3001e784', question: str = None, class_name: str = None, class_index: int = -1)[source]#

Bases: BaseData

A dataclass for representing examples in the TREC dataset.

question: str = None#
class_name: str = None#
class_index: int = -1#