Efficient and Robust Model Benchmarks with Item Response Theory and Adaptive Testing

Peter Flach; Hao Song

doi:10.9781/ijimai

Journal article

Open access

International Journal of Interactive Multimedia and Artificial Intelligence, Vol. 6 (No. 5)

2021

DOI: https://doi.org/10.9781/ijimai

Abstract

Progress in predictive machine learning is typically measured by comparing performance on benchmark datasets. Traditionally, such empirical evaluations are carried out on large numbers of datasets, but this is becoming increasingly hard due to computational requirements and the often large number of alternative methods to compare against. In this paper we investigate adaptive approaches to make model benchmarking more efficient. For a large collection of datasets, rather than training and testing a given approach on every individual dataset, we seek methods that allow us to pick only a few representative datasets to quantify the model’s goodness, and from these to extrapolate to performance on the other datasets. To this end, we adapt existing approaches from psychometrics: specifically, Item Response Theory and Adaptive Testing. Both are well-founded frameworks designed for educational tests. We propose certain modifications to meet the requirements of machine learning experiments, and present experimental results to validate the approach.
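
To make the idea of adaptive dataset selection more concrete, the sketch below uses the standard two-parameter logistic (2PL) model from the psychometrics literature. It is not the modified model proposed in the paper, which is not described in this record. It assumes that each dataset is treated as a test item with a discrimination `a` and a difficulty `b`, that each model has a scalar ability `theta`, and that the next dataset to evaluate is the one with the highest Fisher information at the current ability estimate. The function names and the item bank values are hypothetical.

```python
import math


def p_correct(theta, a, b):
    """2PL item characteristic curve: success probability at ability theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def fisher_information(theta, a, b):
    """Fisher information of a 2PL item at ability theta: a^2 * p * (1 - p)."""
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)


def pick_next_item(theta, items, used):
    """Index of the unused item that is most informative at ability theta."""
    candidates = [i for i in range(len(items)) if i not in used]
    return max(candidates, key=lambda i: fisher_information(theta, *items[i]))


# Hypothetical item bank: one (discrimination a, difficulty b) pair per dataset.
items = [(1.0, -2.0), (1.5, 0.0), (0.8, 2.0)]

# With a current ability estimate of 0.1 and nothing evaluated yet,
# choose the dataset expected to tell us the most about the model.
next_item = pick_next_item(theta=0.1, items=items, used=set())
```

In a full adaptive test, the ability estimate would be updated after each evaluated dataset and the selection step repeated until the estimate is sufficiently stable; that update step is omitted here.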

CC BY V4.0, Open Access

Details

Title: Efficient and Robust Model Benchmarks with Item Response Theory and Adaptive Testing
Creators: Peter Flach; Hao Song
Publication Details: International Journal of Interactive Multimedia and Artificial Intelligence, Vol. 6 (No. 5)
Identifiers: 999197209548
Academic Unit: The Alan Turing Institute
Language: English
Resource Type: Journal article
Date published: 2021
