sampler
The sampler here is designed to select examples for few-shot in-context learning (ICL).
It differs from PyTorch's Sampler in torch.utils.data.sampler, which is used to sample data for training.
Our sampler directly determines which few-shot examples are used, so it can lead to different performance in few-shot ICL.
Classes
| ClassSampler | Sample from the dataset based on the class labels. |
| RandomSampler | Simple random sampler to sample from the dataset. |
| Sample | Output data structure for each sampled data in the sequence. |
- class Sample(index: int, data: T_co)
Bases: Generic[T_co]
Output data structure for each sampled data in the sequence.
- index: int
- data: T_co
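To illustrate the shape of this structure, here is a minimal sketch of an index-plus-data record. It is a stand-in written for this page, not the library's own class: the dataclass form and the `wrap_dataset` helper are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Generic, List, Sequence, TypeVar

T_co = TypeVar("T_co", covariant=True)


@dataclass
class Sample(Generic[T_co]):
    """Illustrative stand-in: a dataset position paired with its item."""

    index: int  # position of the item in the original dataset
    data: T_co  # the item itself


def wrap_dataset(dataset: Sequence) -> List[Sample]:
    # Hypothetical helper: pair every item with its index so that a
    # sampled item can always be traced back to the source dataset.
    return [Sample(index=i, data=item) for i, item in enumerate(dataset)]


wrapped = wrap_dataset(["a", "b", "c"])
```

Keeping the index alongside the data is what lets a sampler check for duplicates without comparing the items themselves.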
- class Sampler(*args, **kwargs)
Bases: Generic[T_co]
- dataset: Sequence[object] = None
- class RandomSampler(dataset: Sequence[T_co] | None = None, default_num_shots: int | None = None)
Bases: Sampler, Generic[T_co]
Simple random sampler to sample from the dataset.
- random_replace(shots: int, samples: List[Sample[T_co]], replace: bool | None = False) → List[Sample[T_co]]
Randomly replace `shots` of the items in `samples`.
If replace is True, duplicate checks are skipped.
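The replacement step described above can be sketched as a standalone function. This is a rough illustration under stated assumptions, not the library's implementation: `random_replace_sketch`, its `rng` parameter, and the use of plain `(index, item)` tuples in place of Sample objects are all hypothetical, and it assumes the dataset is larger than the current selection.

```python
import random
from typing import List, Optional, Sequence, Tuple

# (index, item) pairs stand in for the documented Sample objects.
Pair = Tuple[int, str]


def random_replace_sketch(
    dataset: Sequence[str],
    samples: List[Pair],
    shots: int,
    replace: bool = False,
    rng: Optional[random.Random] = None,
) -> List[Pair]:
    rng = rng or random.Random()
    result = list(samples)
    # Pick which positions of the current selection get swapped out.
    for pos in rng.sample(range(len(result)), shots):
        if replace:
            # Duplicate checks skipped: any dataset index may be drawn.
            candidates = list(range(len(dataset)))
        else:
            # Only draw indices that are not already in the selection.
            used = {i for i, _ in result}
            candidates = [i for i in range(len(dataset)) if i not in used]
        new_index = rng.choice(candidates)
        result[pos] = (new_index, dataset[new_index])
    return result


dataset = [f"example-{i}" for i in range(10)]
current = [(0, dataset[0]), (1, dataset[1]), (2, dataset[2])]
updated = random_replace_sketch(dataset, current, shots=2, rng=random.Random(0))
```

With `replace=False`, the sketch keeps the selection free of duplicates, which is the behavior the duplicate check is presumably there to guarantee.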
- class ClassSampler(dataset: Sequence[T_co], num_classes: int, get_data_key_fun: Callable, default_num_shots: int | None = None)
Bases: Sampler, Generic[T_co]
Sample from the dataset based on the class labels.
T_co can be any type of data (e.g., dict or list); get_data_key_fun is used to extract the class label from each item.
Example: Initialize
`dataset = [{"coarse_label": i} for i in range(10)]`
`sampler = ClassSampler[Dict](dataset, num_classes=6, get_data_key_fun=lambda x: x["coarse_label"])`
- random_replace(shots: int, samples: List[Sample], replace: bool | None = False, weights_per_class: List[float] | None = None) → Sequence[Sample[T_co]]
Randomly select `shots` items from `samples` and replace each one with another sample that has the same class index.
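A class-preserving replacement of this kind can be sketched as follows. This is a minimal illustration under assumptions, not the library's code: `class_replace_sketch`, `get_label`, and `rng` are hypothetical names, plain `(index, item)` tuples stand in for Sample objects, and the `weights_per_class` option is not modeled.

```python
import random
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Pair = Tuple[int, dict]


def class_replace_sketch(
    dataset: Sequence[dict],
    samples: List[Pair],
    shots: int,
    get_label: Callable[[dict], int],
    rng: Optional[random.Random] = None,
) -> List[Pair]:
    rng = rng or random.Random()
    # Bucket dataset indices by class label.
    by_class: Dict[int, List[int]] = defaultdict(list)
    for i, item in enumerate(dataset):
        by_class[get_label(item)].append(i)

    result = list(samples)
    for pos in rng.sample(range(len(result)), shots):
        label = get_label(result[pos][1])
        used = {i for i, _ in result}
        # Draw an unused item of the same class; keep the current one
        # if the class has no other unused member.
        candidates = [i for i in by_class[label] if i not in used]
        if candidates:
            new_index = rng.choice(candidates)
            result[pos] = (new_index, dataset[new_index])
    return result


label_of = lambda x: x["coarse_label"]
class_dataset = [{"coarse_label": i % 3} for i in range(9)]
class_current = [(i, class_dataset[i]) for i in range(3)]
class_updated = class_replace_sketch(
    class_dataset, class_current, shots=2, get_label=label_of, rng=random.Random(0)
)
```

Because each swap stays within a class, the class balance of the few-shot selection is unchanged after replacement.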