Datasets
Overview
- class BigBenchHard(task_name: Literal['object_counting'] = 'object_counting', root: str = None, split: Literal['train', 'val', 'test'] = 'train', *args, **kwargs)
Bases: Dataset
Big Bench Hard dataset for object counting task.
Task names are listed at suzgunmirac/BIG-Bench-Hard.
Data will be saved to ~/.adalflow/cache_datasets/BBH_object_counting/{split}.csv if root is not specified.
Size for each split:
- train: 50 examples
- val: 50 examples
- test: 100 examples
- Parameters:
task_name (str) – The name of the task. “{task_name}” is the task name in the dataset.
root (str, optional) – Root directory of the dataset to save the data. Defaults to ~/.adalflow/cache_datasets/task_name.
split (str, optional) – The dataset split, supports "train" (default), "val" and "test".
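The cache location described above can be sketched as a small path helper. This is a minimal sketch, not the library's code: `bbh_cache_file` is a hypothetical name, and the layout used when `root` is given (the same `BBH_{task_name}/{split}.csv` structure under `root`) is an assumption, since the text only documents the default location.

```python
from pathlib import Path
from typing import Optional

VALID_SPLITS = ("train", "val", "test")


def bbh_cache_file(
    split: str = "train",
    root: Optional[str] = None,
    task_name: str = "object_counting",
) -> Path:
    """Hypothetical helper that mirrors the documented cache layout."""
    if split not in VALID_SPLITS:
        raise ValueError(f"split must be one of {VALID_SPLITS}, got {split!r}")
    # Documented default: ~/.adalflow/cache_datasets when root is not specified.
    base = Path(root) if root is not None else Path.home() / ".adalflow" / "cache_datasets"
    # Assumption: a custom root keeps the same BBH_{task_name}/{split}.csv layout.
    return base / f"BBH_{task_name}" / f"{split}.csv"


default_train = bbh_cache_file()
custom_val = bbh_cache_file("val", root="/tmp/bbh_cache")
```

Here `default_train` points inside the home directory and `custom_val` inside `/tmp/bbh_cache`; neither call touches the file system or downloads anything.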
- class HotPotQA(only_hard_examples=True, root: str = None, split: Literal['train', 'val', 'test'] = 'train', keep_details: Literal['all', 'dev_titles', 'none'] = 'dev_titles', size: int = None, **kwargs)
Bases: Dataset
- class Example(id: str = 'bb33e8bd-717c-476c-ba00-e7e69940c2e4', question: str = None, answer: str = None)
Bases: DataClass
A common dataclass for representing examples in a dataset.
- id: str = 'bb33e8bd-717c-476c-ba00-e7e69940c2e4'
- question: str = None
- answer: str = None
- class HotPotQAData(id: str = 'bb33e8bd-717c-476c-ba00-e7e69940c2e4', question: str = None, answer: str = None, gold_titles: set = None)
Bases: Example
A dataclass for representing examples in the HotPotQA dataset.
- gold_titles: set = None
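The field layout of `Example` and `HotPotQAData` can be illustrated with standard-library dataclasses. This is a sketch only: the real classes derive from the library's `DataClass` base, which is not reproduced here. It also assumes that the UUID shown as the `id` default is a generated value, so the sketch creates a fresh UUID per instance. The question, answer, and titles below are made-up placeholders, not dataset records.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Example:
    """Stand-in for the documented Example fields (stdlib dataclass, not DataClass)."""

    # Assumption: each instance gets its own UUID string as the default id.
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    question: Optional[str] = None
    answer: Optional[str] = None


@dataclass
class HotPotQAData(Example):
    """Adds the set of gold supporting titles to the common fields."""

    gold_titles: Optional[Set[str]] = None


# Placeholder values for illustration only.
item = HotPotQAData(
    question="placeholder question",
    answer="placeholder answer",
    gold_titles={"Title A", "Title B"},
)
```

Because `HotPotQAData` inherits from `Example`, code written against the common `id`, `question`, and `answer` fields also accepts HotPotQA items.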
- class TrecDataset(root: str = None, split: Literal['train', 'test'] = 'train')
Bases: Dataset
Trec dataset for question classification.
Here we only load a small subset of the dataset for training and evaluation.
By default: train: 600 examples (100 per class), val: 36, test: 144. All splits are class-balanced.
Reference:
- Dataset: https://huggingface.co/datasets/trec
- Labels: https://huggingface.co/datasets/trec/blob/main/trec.py
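Drawing a class-balanced subset such as "600 examples, 100 per class" can be sketched as follows. This illustrates the general technique with the standard library and is not the sampling code used by `TrecDataset`; `sample_balanced`, the seed, and the toy rows are all hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Row = Tuple[str, int]  # (question text, integer class label)


def sample_balanced(rows: Sequence[Row], per_class: int, seed: int = 0) -> List[Row]:
    """Pick exactly `per_class` rows for every label that appears in `rows`."""
    rng = random.Random(seed)
    by_label: Dict[int, List[Row]] = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)

    subset: List[Row] = []
    for label in sorted(by_label):
        group = by_label[label]
        if len(group) < per_class:
            raise ValueError(f"label {label} has only {len(group)} rows")
        # Sample without replacement within each class.
        subset.extend(rng.sample(group, per_class))
    rng.shuffle(subset)  # Avoid returning the rows grouped by label.
    return subset


# Toy data: 6 labels with 150 rows each, standing in for real questions.
toy_rows = [(f"question {i}", i % 6) for i in range(900)]
train_subset = sample_balanced(toy_rows, per_class=100)
```

With six labels and `per_class=100`, `train_subset` holds 600 rows, which matches the train size quoted above.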