To assist researchers and policymakers focusing on the determinants and impacts of artificial intelligence (AI) invention, OCE released two data files, collectively called the Artificial Intelligence Patent Dataset (AIPD). The first data file identifies United States (U.S.) patents issued between 1976 and 2020 and pre-grant publications (PGPubs) published through 2020 that contain one or more of several AI technology components (including machine learning, natural language processing, computer vision, speech, knowledge processing, AI hardware, evolutionary computation, and planning and control). OCE generated this data file using a machine learning (ML) approach that analyzed patent text and citations to identify AI in U.S. patent documents (Abood and Feltenberger 2018; Toole et al. 2020). OCE’s approach is based on the methodology of Abood and Feltenberger (2018), but also includes an analysis of patent claims to better identify AI contained in the technical and legal scope of the invention. The second data file contains the patent documents used to train the ML models.
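As a rough illustration of how the first data file might be used, the sketch below assumes each patent document record carries one 0/1 indicator per AI technology component, and treats a document as containing AI when at least one indicator is set. The field names (`doc_id`, `ml`, `nlp`, and so on) and the sample records are hypothetical placeholders, not the actual AIPD schema; the dataset's own documentation is the place to confirm the real column names and coding.

```python
# Hypothetical field names, one per AI technology component listed above.
# These are assumptions for illustration, not the real AIPD column names.
AI_COMPONENTS = [
    "ml",        # machine learning
    "nlp",       # natural language processing
    "vision",    # computer vision
    "speech",    # speech
    "kr",        # knowledge processing
    "hardware",  # AI hardware
    "evo",       # evolutionary computation
    "planning",  # planning and control
]


def components_present(record):
    """Return the component names whose indicator equals 1 in this record."""
    return [name for name in AI_COMPONENTS if int(record.get(name, 0)) == 1]


def is_ai_document(record):
    """A document counts as AI if it contains one or more components."""
    return len(components_present(record)) > 0


# Made-up records for illustration only; they are not real patent documents.
sample_records = [
    {"doc_id": "EXAMPLE-1", "ml": 1, "vision": 1},
    {"doc_id": "EXAMPLE-2"},
    {"doc_id": "EXAMPLE-3", "speech": 1},
]

# Keep only the identifiers of documents flagged for at least one component.
ai_docs = [r["doc_id"] for r in sample_records if is_ai_document(r)]
```

Keeping the component list in one place makes it straightforward to swap in the real column names once they are known, and the same two helpers could be applied to rows read from the released file instead of the in-memory sample.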