Leveraging AI for Skills Extraction & Research


Skills are the common language of education, training, and employment. From course and program descriptions to job postings, they are the terms in which we describe the learning outcomes of education and training programs and the capabilities that firms seek. Skills have also become a growing focus of government policy, as leaders grapple with skill-biased technological change and aging working populations, and aim to promote equity and inclusion through skills-based hiring.

Educators, employers, learners, and policymakers often struggle to communicate in mutually intelligible ways across the diverse skill languages of their disciplines, occupations, industries, and regions. This contributes to many inefficiencies, from inadequate recognition of prior learning, work experience, and “alternative credentials” to poor alignment between the skills cultivated by educators and those prioritized in labor markets.

To address this problem, we aim to make the language of skills more fully intelligible within education and training systems, and between them and firms. We recognize that other initiatives share this aim, and note that they often create a single, authoritative, standardized set of skill definitions within a master taxonomy: a “single source of truth for skills and learning data,” as one EdTech company puts it. Unfortunately, this strategy risks competition among the creators of different master taxonomies, succeeds only when educators and firms commit strongly to adopting the new language, and may require many changes to existing skills descriptors and organizational practices.

Our Goal:

Help learners, educators and employers share trusted and mutually intelligible information about skills.

Our solution is a skills translation tool that works across data environments, built on a flexible architecture for translating skills among data sets. We use artificial intelligence (AI) and machine learning (ML) algorithms to link skills across multiple taxonomies, creating in effect a “taxonomy of taxonomies.” These algorithms let us link one set of skills data to another with all the benefits of a static crosswalk, but with the flexibility to keep linking different types of skills data as they evolve. The result is a dynamic system of skill translation that uses the similarities between taxonomies to match the conceptual cluster a skill occupies in one set of definitions to its counterpart in another.
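To make the idea of similarity-based linking concrete, the sketch below matches each skill in one taxonomy to its closest counterpart in another. It is an illustration only, not our production method: it substitutes a simple bag-of-words cosine similarity for the learned models a real system would likely use, and the taxonomy entries, the function names (tokenize, cosine, link_skills), and the 0.3 threshold are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a skill description and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def link_skills(source, target, threshold=0.3):
    """Map each source skill ID to (best target ID or None, similarity score).

    A skill is left unlinked (None) when no target description reaches the
    threshold. The threshold value is an assumption chosen for illustration.
    """
    target_vecs = {tid: Counter(tokenize(desc)) for tid, desc in target.items()}
    links = {}
    for sid, desc in source.items():
        vec = Counter(tokenize(desc))
        best_id, best_score = None, 0.0
        for tid, tvec in target_vecs.items():
            score = cosine(vec, tvec)
            if score > best_score:
                best_id, best_score = tid, score
        if best_score < threshold:
            best_id = None
        links[sid] = (best_id, best_score)
    return links


# Hypothetical entries standing in for two real skills taxonomies.
taxonomy_a = {
    "A1": "Analyze data using statistical software",
    "A2": "Written communication in reports",
    "A3": "Negotiate supplier contracts",
}
taxonomy_b = {
    "B1": "Written communication and report writing",
    "B2": "Statistical data analysis",
    "B3": "Operate heavy machinery",
}

links = link_skills(taxonomy_a, taxonomy_b)
for skill_id, (match_id, score) in links.items():
    print(skill_id, "->", match_id, round(score, 2))
```

Because the links are recomputed from the descriptions themselves, adding or rewording a skill in either taxonomy only requires rerunning the match rather than hand-editing a static crosswalk. Skills with no sufficiently similar counterpart stay unlinked and can be flagged for human review.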

Process: The process is designed to be dynamic, so that linkages can be updated continually, and flexible enough to incorporate new data as they become available. The ML algorithm underlying our design allows for regular adjustment as data sets evolve without requiring redesign of the core systems, while our AI-based implementation facilitates rapid updating of existing linkages. These systems also benefit from ongoing work in the field, which allows new capabilities and features to be added without significant R&D expenditure on the underlying algorithms and processes. Instead, we can focus on learning from verified outputs to increase scope and reliability, and on adding language capabilities as new systems come online.

Going Forward: We plan to begin testing this system in 2024 with data partners ranging from government and research centers to private institutions. To that end, we are looking for partners to help develop and test the project, as well as additional investment to scale our work and accelerate model improvement. We expect robust demand for our product, especially given the current demand for new data analysis capabilities in higher education and the workforce.



 Tecnológico de Monterrey - 2022 INFORMS Annual Meeting