Datasets
Popular Datasets
MINDS-14 is a dataset designed for the intent detection task with spoken data. It encompasses 14 distinct intents extracted from a commercial system in the e-banking domain.
by @AIOZNetwork
0 AIOZ ($0)
LongBench is a comprehensive multilingual, multi-task benchmark designed to measure and evaluate how well pre-trained language models understand long text.
by @AIOZNetwork
0 AIOZ ($0)
MathVista is a diverse benchmark for mathematical reasoning in visual contexts. It includes 6,141 examples drawn from 31 datasets.
by @AIOZNetwork
0 AIOZ ($0)
The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia.
by @AIOZNetwork
0 AIOZ ($0)
The ARC dataset consists of 7,787 science exam questions drawn from a variety of sources, including science questions provided under license by a research partner affiliated with AI2.
by @AIOZNetwork
0 AIOZ ($0)
A subset of the PLOD dataset used for coursework (CW) in the 2023-2024 NLP module at the University of Surrey.
by @AIOZNetwork
0 AIOZ ($0)
MMLU is a benchmark designed to measure knowledge acquired during pretraining by evaluating models exclusively in zero-shot and few-shot settings.
by @AIOZNetwork
0 AIOZ ($0)
CIFAR-100 is a dataset of small images in 100 classes, used to train and evaluate image classification models.
by @AIOZNetwork
0 AIOZ ($0)