Hobbling Computer Vision Datasets Against Unauthorized Use

Researchers from China have developed a method to copyright-protect image datasets used for computer vision training: the images are effectively ‘watermarked’ before release, and only authorized users can recover the ‘clean’ versions, which are decrypted via a cloud-based platform.

The researchers' tests show that training a machine learning model on the copyright-protected images causes a catastrophic drop in model accuracy. On two popular open source image datasets, models trained on the clean data reached accuracies of 86.21% and 74.00%, while models trained on the non-decrypted data reached only 38.23% and 16.20% respectively.
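
To put those figures in perspective, the short Python sketch below works out the size of each drop, both in absolute percentage points and relative to the clean baseline. The four accuracy values are the ones quoted above; the `accuracy_drop` helper and the variable names are illustrative only and do not come from the researchers' work.

```python
# Accuracy figures quoted in the article, as (clean, protected) pairs in percent.
results = [(86.21, 38.23), (74.00, 16.20)]


def accuracy_drop(clean, protected):
    """Return (absolute drop in percentage points, relative drop in percent).

    Hypothetical helper for illustration; not part of the researchers' system.
    """
    absolute = clean - protected
    relative = 100.0 * absolute / clean
    return round(absolute, 2), round(relative, 2)


drops = [accuracy_drop(clean, protected) for clean, protected in results]

for (clean, protected), (absolute, relative) in zip(results, drops):
    print(
        f"{clean:.2f}% -> {protected:.2f}%: "
        f"down {absolute:.2f} points ({relative:.2f}% of the clean accuracy)"
    )
```

In absolute terms the two datasets lose roughly 48 and 58 percentage points, which means that more than half of the clean-data accuracy is gone in both cases.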

