gsm8k
Classes
GSM8K | Use Hugging Face datasets to load the GSM8K dataset.
class GSM8K(root: str = None, split: Literal['train', 'val', 'test'] = 'train', size: int = None, **kwargs)
Bases: Dataset
Uses the Hugging Face datasets library to load the GSM8K dataset.
official_train: 7473
official_test: 1319
Our train split: 3736/2
Our val split: 3736/2
Our test split: 1319
You can use size to limit the number of examples to load.
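As a rough illustration of what limiting by size can look like, the sketch below keeps the first `size` items of a sequence. The helper `limit_examples` is hypothetical and is not part of this module; whether GSM8K itself truncates from the start of the split or selects examples some other way is not stated here, so treat the slicing behavior as an assumption.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def limit_examples(examples: Sequence[T], size: Optional[int] = None) -> List[T]:
    """Hypothetical helper: keep the first `size` examples, or all of them if size is None."""
    if size is None:
        return list(examples)
    if size < 0:
        raise ValueError("size must be non-negative")
    # Assumption: truncation takes a prefix; the real class may behave differently.
    return list(examples[:size])


# Stand-in for a split with 1319 examples.
subset = limit_examples(range(1319), size=10)
```

Passing a size larger than the split simply returns every example in this sketch.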
Example:
dataset = GSM8K(split="train", size=10)
print(f"example: {dataset[0]}")
The output will be:
example: GSM8KData(id='8fc791e6-ea1d-472c-a882-d00d0600d423', question="The result from the 40-item Statistics exam Marion and Ella took already came out. Ella got 4 incorrect answers while Marion got 6 more than half the score of Ella. What is Marion's score?", answer='24', gold_reasoning="Ella's score is 40 items - 4 items = <<40-4=36>>36 items. Half of Ella's score is 36 items / 2 = <<36/2=18>>18 items. So, Marion's score is 18 items + 6 items = <<18+6=24>>24 items.", reasoning=None)
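The gold_reasoning field in the output above carries calculator annotations of the form <<expression=result>>. A minimal sketch of pulling those out, or stripping them to get plain text, might look like the following. The function names `extract_annotations` and `strip_annotations` are hypothetical helpers written for this illustration and are not provided by the GSM8K class.

```python
import re

# Matches calculator annotations such as <<40-4=36>>.
ANNOTATION = re.compile(r"<<([^=<>]+)=([^<>]+)>>")


def extract_annotations(gold_reasoning: str) -> list[tuple[str, str]]:
    """Return (expression, result) pairs in order of appearance."""
    return ANNOTATION.findall(gold_reasoning)


def strip_annotations(gold_reasoning: str) -> str:
    """Remove the <<...>> markers, leaving the plain-language reasoning."""
    return ANNOTATION.sub("", gold_reasoning)


# The gold_reasoning string from the example output.
gold_reasoning = (
    "Ella's score is 40 items - 4 items = <<40-4=36>>36 items. "
    "Half of Ella's score is 36 items / 2 = <<36/2=18>>18 items. "
    "So, Marion's score is 18 items + 6 items = <<18+6=24>>24 items."
)

steps = extract_annotations(gold_reasoning)
plain = strip_annotations(gold_reasoning)
```

Here the result of the last annotation matches the answer field ('24'), which can serve as a quick consistency check on a loaded example.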