data

machine learning

I maintain mirrors of several machine learning datasets, including MNIST (both the standard and "fashion" versions), CIFAR-10, and a number of those used by the Rust-ML linfa-datasets library. I am also a contributor and author, respectively, of the Rust mnist and cifar-ten crates. Please feel free to use these mirrors, but try not to crush them with busy CI pipelines; caching the files after the first download helps.

mnist

standard

    train-images-idx3-ubyte.gz
    train-labels-idx1-ubyte.gz
    t10k-images-idx3-ubyte.gz
    t10k-labels-idx1-ubyte.gz

fashion

    train-images-idx3-ubyte.gz
    train-labels-idx1-ubyte.gz
    t10k-images-idx3-ubyte.gz
    t10k-labels-idx1-ubyte.gz
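
Both MNIST variants listed above are normally distributed as gzip-compressed IDX files: a 4-byte header (two zero bytes, a type code, and the number of dimensions), then one big-endian 32-bit size per dimension, then the raw data. Here is a minimal Python sketch of a reader, assuming the mirrored files match that layout and hold unsigned bytes (type code 0x08). The names parse_idx and load_idx_gz are made up for illustration and are not part of the mnist crate.

```python
import gzip
import struct


def parse_idx(raw: bytes):
    """Parse an uncompressed IDX buffer into (dims, payload)."""
    # Header: two zero bytes, a type code, and the number of dimensions.
    zero, dtype_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or dtype_code != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")
    # Each dimension size is a big-endian 32-bit unsigned integer.
    dims = struct.unpack(f">{ndim}I", raw[4:4 + 4 * ndim])
    payload = raw[4 + 4 * ndim:]
    # The payload must hold exactly one byte per element.
    expected = 1
    for d in dims:
        expected *= d
    if len(payload) != expected:
        raise ValueError("payload size does not match header")
    return dims, payload


def load_idx_gz(path):
    """Decompress a .gz IDX file from disk and parse it (hypothetical helper)."""
    with gzip.open(path, "rb") as f:
        return parse_idx(f.read())
```

For an image file, dims should come back as (count, rows, columns), and for a label file as (count,), so image i occupies payload[i * rows * columns : (i + 1) * rows * columns].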

cifar-10

    cifar-10-binary.tar.gz
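
The binary version of CIFAR-10 stores each example as a fixed-size record: one label byte followed by 3072 pixel bytes (32x32 red, then green, then blue planes, each in row-major order). The sketch below splits the contents of one already-extracted batch file into records, assuming the mirrored archive follows that layout. parse_cifar10_batch is a hypothetical name and is not the cifar-ten crate's API.

```python
RECORD_SIZE = 1 + 3 * 32 * 32  # 1 label byte + 3072 pixel bytes


def parse_cifar10_batch(raw: bytes):
    """Split the bytes of one CIFAR-10 binary batch into (label, pixels) pairs."""
    if len(raw) % RECORD_SIZE != 0:
        raise ValueError("buffer is not a whole number of CIFAR-10 records")
    records = []
    for offset in range(0, len(raw), RECORD_SIZE):
        label = raw[offset]  # class index, 0 through 9
        pixels = raw[offset + 1:offset + RECORD_SIZE]  # R plane, G plane, B plane
        records.append((label, pixels))
    return records
```

The batch files first need to be unpacked from the archive, for example with Python's tarfile module, before their bytes are passed to this function.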

linfa-datasets

    diabetes_data.csv.gz
    diabetes_target.csv.gz
    iris.csv.gz
    linnerud_exercise.csv.gz
    linnerud_physiological.csv.gz
    winequality-red.csv.gz

drone imagery

Here's some imagery I've recorded with a DJI Mini 2. The photos are raw .jpg files; the orthomosaics are low-resolution .webp or high-resolution .png files constructed using OpenDroneMap.

A georeferenced map of the orthomosaic data can be found here.

    Arroyo Quemado minor, 2021-08-27: mosaic (31 MB), stitched using OpenCV
    Tarp in a field: 9.9m, 20m
    Mashapaug Pond, 2022-03-10: .webp (62 KB), .png (15 MB)
    Donnelly Land Preserve, 2022-03-10: .webp (3.4 MB), .png (424 MB)
Unless otherwise specified, all writing and imagery (besides the machine learning datasets) on this site is licensed under CC-BY, and the source code under AGPL.