How the datasets are built
This page explains how samples are collected, reviewed, and packaged before they are listed as downloadable bundles.
Overview
- Samples are collected from a defined generator or model family.
- Files are reviewed, cleaned, and checked for duplicates or bad outputs.
- Edge cases are identified and retained intentionally when they add evaluation value.
- Bundles are versioned so updates can be tracked clearly over time.
- Only the files intended for release are included in the final package.
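The duplicate check mentioned in the list above could be sketched roughly as follows. This is a minimal illustration, not the actual review pipeline: it assumes duplicates are exact byte-for-byte copies, and the function names `file_digest` and `find_duplicates` are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large samples do not need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under root by content hash (hypothetical helper).

    Only groups containing more than one file are returned, so every
    value in the result is a set of exact duplicates.
    """
    groups: dict[str, list[Path]] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            groups.setdefault(file_digest(path), []).append(path)
    return {digest: paths for digest, paths in groups.items() if len(paths) > 1}
```

Calling `find_duplicates(Path("samples"))` on a hypothetical `samples` directory returns a mapping from each shared hash to the files that carry it. Exact hashing only catches identical files; near-duplicates would need a different technique, such as perceptual hashing.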
This page is public so buyers can review how the datasets are assembled before purchasing.