isco-crosswalks

library(iscoCrosswalks)
library(data.table)
library(knitr) # provides kable(), used below to print tables

Introduction

Eurostat, CEDEFOP, and other EU agencies have been exploring the use of the ISCO classification to link job–worker characteristics, such as skill importance and level ratings, to European jobs. O*NET is a reliable source of occupational data for the European labour market because of its strong theoretical and empirical basis and the similarities between the European and US economic systems. Because the O*NET database classifies jobs differently from the EU, drawing on O*NET requires mapping one classification to the other. This is done through a “concordance” (also known as a “crosswalk” or “correspondence table”). We provide publicly accessible software for methodologically transparent concordances, in the form of an R package that we hope will greatly reduce the cost and time required to perform an approximate mapping between the two classification systems.

Understanding how jobs are organized differently in the US and the EU helps explain the need for a more granular concordance. While the SOC and the ISCO have similar goals, they differ significantly in how occupational profiles are organized and in the information those profiles contain. The US SOC has 867 6-digit classifications for statistics reporting (e.g., number of people employed or average wage). The EU, on the other hand, uses the ISCO 4-digit classification, which has 436 unit groups, and adds an extra layer of approximately 3000 occupational distinctions. Both systems group occupations according to a four-level hierarchy, with the European system adding an extra degree of detail.
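Both hierarchies are encoded in the codes themselves, which is what makes aggregation to a coarser level possible. The following is a minimal sketch, assuming ISCO-08 codes are stored as 4-character strings and SOC codes as 6-character strings without the hyphen. The example codes and all variable names are illustrative and are not objects exported by iscoCrosswalks.

```r
# Illustrative ISCO-08 unit-group codes (4 digits).
# Each shorter prefix identifies a broader group in the hierarchy.
isco_unit <- c("2511", "2512", "7231")

isco_levels <- data.frame(
  unit_group      = isco_unit,                # level 4
  minor_group     = substr(isco_unit, 1, 3),  # level 3
  sub_major_group = substr(isco_unit, 1, 2),  # level 2
  major_group     = substr(isco_unit, 1, 1),  # level 1
  stringsAsFactors = FALSE
)
print(isco_levels)

# Illustrative SOC detailed-occupation codes (6 digits, hyphen omitted).
# The major group keeps the first two digits and zero-fills the rest.
soc_detailed <- c("151132", "151131", "493023")
soc_major <- paste0(substr(soc_detailed, 1, 2), "0000")
print(soc_major)
```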

Crosswalks

This package introduces an algorithm that performs approximate matching between the ISCO and SOC classifications using concordances provided by the Institute for Structural Research and the Faculty of Economics, University of Warsaw. The crosswalks offer a complete step-by-step mapping of O*NET data to ISCO-88 and ISCO-08 coding using an expanded version of SOC-00 and SOC-10 coding. Building on that research, we propose a mapping method that first converts measurements to the smallest possible unit of the target taxonomy and then aggregates (or estimates) them at the required level of detail.

“For raw-count data, the sum by occupation group is used; for composite indicators, the mean value is used for each occupational group of the target hierarchical level.”
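The two steps described above (map to the finest target unit, then aggregate by sum or mean) can be sketched with data.table on made-up data. This is not the package's implementation: the tables src and xwalk, their codes and values, and the helper aggregate_target() are hypothetical names introduced only for illustration.

```r
library(data.table)

# Hypothetical source data: one value per made-up source code.
src <- data.table(src_code = c("A1", "A2", "B1"),
                  value    = c(10, 20, 40))

# Hypothetical concordance: a source code may link to several target units.
xwalk <- data.table(src_code     = c("A1", "A2", "A2", "B1"),
                    target_unit  = c("T11", "T11", "T21", "T21"),
                    target_group = c("T1", "T1", "T2", "T2"))

# Step 1: carry each value to the finest unit of the target taxonomy.
mapped <- merge(src, xwalk, by = "src_code", allow.cartesian = TRUE)

# Step 2: aggregate to the requested level of the target hierarchy,
# using the mean for composite indicators and the sum for raw counts.
aggregate_target <- function(dt, indicator = TRUE) {
  agg <- if (indicator) mean else sum
  dt[, .(value = agg(value)), by = target_group]
}

res_mean <- aggregate_target(mapped, indicator = TRUE)
res_sum  <- aggregate_target(mapped, indicator = FALSE)
print(res_mean)
print(res_sum)
```

Note that this naive sum counts the value of A2 once for each target unit it links to. The sketch makes no attempt to apportion counts across multiple links, and the package may handle that case differently.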

Example (ISCO-SOC)

This basic example shows how to translate CEDEFOP’s “Importance of foundation skills” indicator, given in ISCO (2008), to the SOC (2010) classification:

library(iscoCrosswalks)

This indicator shows the percentage of jobs in which foundation skills (literacy, numeracy, ICT, and foreign languages) are highly important for doing the work. It is based on the findings of Cedefop’s European survey of skills and jobs.

The foundation skills indicator is also included in iscoCrosswalks as an example data set, foundation_skills. It consists of three variables: Occupations, Skill, and Value.

To perform the transformation, we’ve added a further column, preferredLabel, holding the preferred labels from the ISCO taxonomy. In the R terminal, type isco to view the available labels. For small data sets, entering the preferred labels manually is suggested. For large data sets, see the R package labourR, which automates occupation coding.

Inspecting every fifth row of the indicator:

kable(foundation_skills[seq(1 , nrow(foundation_skills), by = 5), ])
| Occupations | preferredLabel | Skill | Value |
|---|---|---|---|
| Managers | Managers | Foreign language skills | 10.10 |
| Professionals | Professionals | ICT | 86.28 |
| Associate professionals | Technicians and associate professionals | Literacy | 59.06 |
| Clerks | Clerical support workers | Numeracy | 26.30 |
| Farm and related workers | Skilled agricultural, forestry and fishery workers | Foreign language skills | 1.78 |
| Trades workers | Craft and related trades workers | ICT | 34.80 |
| Operators and assemblers | Plant and machine operators and assemblers | Literacy | 18.93 |
| Elementary workers | Elementary occupations | Numeracy | 7.20 |

To translate the indicator to the SOC classification, iscoCrosswalks requires two mandatory columns: job, holding the preferred labels of the taxonomy, and value, holding the value of the indicator.

Thus, we rename preferredLabel to job, and Value to value.

data.table::setnames(foundation_skills,
                     c("preferredLabel", "Value"),
                     c("job", "value"))
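Before calling the crosswalk, it can help to confirm that the input has the expected shape. The helper below is a hypothetical sketch and is not part of iscoCrosswalks: it only checks for the job and value columns named above, plus any grouping columns, and assumes that value should be numeric. The toy table uses made-up values.

```r
library(data.table)

# Hypothetical helper (not part of iscoCrosswalks): stop early if a
# required column is missing or if `value` is not numeric.
check_crosswalk_input <- function(dt, brkd_cols = NULL) {
  required <- c("job", "value", brkd_cols)
  missing_cols <- setdiff(required, names(dt))
  if (length(missing_cols) > 0) {
    stop("Missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  if (!is.numeric(dt[["value"]])) {
    stop("Column 'value' must be numeric.")
  }
  invisible(TRUE)
}

# Made-up input with the expected columns.
toy <- data.table(job   = c("Managers", "Professionals"),
                  Skill = c("ICT", "ICT"),
                  value = c(1, 2))
check_crosswalk_input(toy, brkd_cols = "Skill")
```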

The isco_soc_crosswalk() function translates the values to the target taxonomy. The brkd_cols parameter accepts a vector naming any additional columns used for grouping.

Because this is a composite score, we set indicator = TRUE so that the mean value is used. If raw counts were given instead, we would set indicator = FALSE so that the values are summed across the units of the hierarchy.

soc_foundation_skills <- isco_soc_crosswalk(foundation_skills,
                                            brkd_cols = "Skill",
                                            isco_lvl = 1,
                                            soc_lvl = "soc_1",
                                            indicator = TRUE)

The following table shows the top six occupations per skill for the indicator projected onto the SOC taxonomy.

soc_foundation_skills[, Occupations := gsub(" Occupations", "", soc_label)]
soc_foundation_skills[, Skill := gsub(" skills", "", Skill)]
dat <- soc_foundation_skills[order(Skill, -value)][, head(.SD, 6), by = "Skill"]
kable(dat)
| Skill | soc10 | soc_label | value | Occupations |
|---|---|---|---|---|
| Foreign language | 250000 | Education, Training, and Library Occupations | 11.132354 | Education, Training, and Library |
| Foreign language | 210000 | Community and Social Service Occupations | 10.843333 | Community and Social Service |
| Foreign language | 190000 | Life, Physical, and Social Science Occupations | 10.635116 | Life, Physical, and Social Science |
| Foreign language | 150000 | Computer and Mathematical Occupations | 10.538421 | Computer and Mathematical |
| Foreign language | 170000 | Architecture and Engineering Occupations | 10.196286 | Architecture and Engineering |
| Foreign language | 290000 | Healthcare Practitioners and Technical Occupations | 9.983111 | Healthcare Practitioners and Technical |
| ICT | 110000 | Management Occupations | 86.258177 | Management |
| ICT | 210000 | Community and Social Service Occupations | 85.373333 | Community and Social Service |
| ICT | 250000 | Education, Training, and Library Occupations | 85.127302 | Education, Training, and Library |
| ICT | 190000 | Life, Physical, and Social Science Occupations | 85.014884 | Life, Physical, and Social Science |
| ICT | 150000 | Computer and Mathematical Occupations | 84.848421 | Computer and Mathematical |
| ICT | 170000 | Architecture and Engineering Occupations | 84.259429 | Architecture and Engineering |
| Literacy | 250000 | Education, Training, and Library Occupations | 73.677434 | Education, Training, and Library |
| Literacy | 210000 | Community and Social Service Occupations | 72.601667 | Community and Social Service |
| Literacy | 190000 | Life, Physical, and Social Science Occupations | 71.530930 | Life, Physical, and Social Science |
| Literacy | 110000 | Management Occupations | 71.420471 | Management |
| Literacy | 150000 | Computer and Mathematical Occupations | 71.033684 | Computer and Mathematical |
| Literacy | 170000 | Architecture and Engineering Occupations | 69.274286 | Architecture and Engineering |
| Numeracy | 110000 | Management Occupations | 48.475647 | Management |
| Numeracy | 250000 | Education, Training, and Library Occupations | 41.533545 | Education, Training, and Library |
| Numeracy | 210000 | Community and Social Service Occupations | 41.216667 | Community and Social Service |
| Numeracy | 190000 | Life, Physical, and Social Science Occupations | 40.717209 | Life, Physical, and Social Science |
| Numeracy | 150000 | Computer and Mathematical Occupations | 40.485263 | Computer and Mathematical |
| Numeracy | 170000 | Architecture and Engineering Occupations | 39.664571 | Architecture and Engineering |
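The top-six selection above relies on a common data.table idiom: sort by the grouping column and by descending value, then keep the first n rows of each group with head(.SD, n). A minimal sketch on made-up data (the table toy and its values are purely illustrative):

```r
library(data.table)

# Made-up data: three groups scored on two skills.
toy <- data.table(Skill = c("a", "a", "a", "b", "b", "b"),
                  group = c("g1", "g2", "g3", "g1", "g2", "g3"),
                  value = c(3, 9, 5, 2, 1, 8))

# Sort by Skill, then by descending value; keep the top 2 rows per Skill.
top2 <- toy[order(Skill, -value)][, head(.SD, 2), by = "Skill"]
print(top2)
```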

If the reverse mapping is required, use the soc_isco_crosswalk() function. The preferred labels of the SOC taxonomy can be inspected in the included dataset soc_groups.