The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia.

1
License: cc-by-sa-4.0
Size category: n<1k
Tasks: Text Generation, Fill-Mask
Language: English

Last updated: 6 days ago

Total downloads: 1

Created: July 12, 2024