Denomination

Denomination is the process of scaling one indicator by another quantity to adjust for the effect of size. It is needed because many indicators are linked to the unit’s size (economic size, physical size, population, etc.) in one way or another, and if no adjustments were made, a composite indicator would end up with the largest units at the top and the smallest at the bottom. Often, the adjustment is made by dividing the indicator by a so-called “denominator”, or denominating variable. If units are countries, denominators are typically things like GDP, population or land area.
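To make the size effect concrete, here is a minimal sketch using made-up figures for three hypothetical units (the codes and values are invented for illustration and do not come from any COINr data set). It shows how a ranking based on a raw total can reverse once the indicator is divided by population.

```r
# Hypothetical data: total exports and population for three made-up units
toy <- data.frame(
  uCode = c("AAA", "BBB", "CCC"),
  Exports = c(900, 300, 50),     # raw totals, driven largely by size
  Population = c(100, 20, 2)     # the denominator
)

# Denominate: exports per head of population
toy$Exports_pc <- toy$Exports / toy$Population

# Rank units on each version (rank 1 = highest value)
toy$Rank_raw <- rank(-toy$Exports)
toy$Rank_pc <- rank(-toy$Exports_pc)

print(toy)
```

In this toy case the largest unit is first on the raw total but last per head, which is exactly the kind of size effect that denomination is meant to remove.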

COINr’s Denominate() function lets you perform this operation quickly, flexibly and reproducibly. As with other building functions, it is a generic, which means that it has different methods for data frames, coins and purses. The methods are, however, all fairly similar.

Data frames

We’ll begin by demonstrating denomination on a data frame. We’ll use the in-built data set to get a small sample of indicators:

library(COINr)

# Get a sample of indicator data (note: must be indicators plus a "uCode" column)
iData <- ASEM_iData[c("uCode", "Goods", "Flights", "LPI")]
head(iData)
#>   uCode     Goods  Flights      LPI
#> 1   AUT 278.42640 29.01725 4.097985
#> 2   BEL 597.87230 31.88546 4.108538
#> 3   BGR  42.82515  9.23588 2.807685
#> 4   HRV  28.36795  9.24529 3.160829
#> 5   CYP   8.76681  8.75467 2.999061
#> 6   CZE 274.13650 15.30953 3.674309

This is the raw indicator data for three indicators, plus the “uCode” column which identifies each unit. We will also get some data for denominating the indicators. COINr has an in-built set of denominator data called WorldDenoms:

head(WorldDenoms)
#> # A tibble: 6 × 7
#>   uName          uCode           GDP Population    Area  GDPpc Income_Group     
#>   <chr>          <chr>         <dbl>      <dbl>   <dbl>  <dbl> <chr>            
#> 1 Afghanistan    AFG    19291104008.   38041754  652860   507. Low income       
#> 2 Albania        ALB    15279183290.    2854191   27400  5353. Upper middle inc…
#> 3 Algeria        DZA   171091289782.   43053054 2381740  3974. Lower middle inc…
#> 4 American Samoa ASM      636000000       55312     200 11467. Upper middle inc…
#> 5 Andorra        AND     3154057987.      77142     470 40886. High income      
#> 6 Angola         AGO    88815697793.   31825295 1246700  2791. Lower middle inc…

Now, the main things to specify in denomination are which indicators to denominate, and by what. In other words, we need to map the indicators to the denominators. In the example, the export of goods should be denominated by GDP, passenger flight capacity by population (GDP could also be a reasonable choice), and “LPI” (the logistics performance index) is an intensive variable that does not need to be denominated.

This specification is passed to Denominate() using the denomby argument. This takes a data frame with three columns: “iCode” (the name of the column to be denominated), “Denominator” (the name of the column in the denominator data frame to use), and “ScaleFactor” (a multiplying factor to apply if needed). We create this data frame here:

# specify how to denominate
denomby <- data.frame(iCode = c("Goods", "Flights"),
                      Denominator = c("GDP", "Population"),
                      ScaleFactor = c(1, 1000))

A second important consideration is that the rows of the indicators and the denominators need to be matched, so that each unit is denominated by the value corresponding to that unit, and not another unit. Notice that the WorldDenoms data frame covers more or less all countries in the world, whereas the sample indicators only cover 51 countries. The matching is performed inside the Denominate() function, using an identifier column which must be present in both data frames. Here, our common column is “uCode”, which is already found in both data frames. This is also the default column name expected by Denominate(), so we don’t even need to specify it. If you have other column names, use the x_ID and denoms_ID arguments to pass these names to the function.

We are now ready to denominate:

# Denominate one by the other
iData_den <- Denominate(iData, WorldDenoms, denomby)

head(iData_den)
#>   uCode        Goods     Flights      LPI
#> 1   AUT 6.255713e-10 0.003268788 4.097985
#> 2   BEL 1.121507e-09 0.002776498 4.108538
#> 3   BGR 6.246483e-10 0.001323996 2.807685
#> 4   HRV 4.669422e-10 0.002272966 3.160829
#> 5   CYP 3.513901e-10 0.007304232 2.999061
#> 6   CZE 1.093569e-09 0.001434859 3.674309

The function has matched each unit in iData with its corresponding denominator value in WorldDenoms and divided the former by the latter. As expected, “Goods” and “Flights” have changed, but “LPI” has not because it was not included in the denomby data frame.
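The matching-and-dividing step can be pictured with a few lines of base R. This is only a sketch of the idea under simplifying assumptions, not COINr's actual implementation: the data frames `ind` and `den` and their values are made up for illustration, and the scale factor is assumed to multiply the result of the division.

```r
# Hypothetical indicator data for three units
ind <- data.frame(uCode = c("BBB", "AAA", "CCC"),
                  Flights = c(20, 30, 8))

# Hypothetical denominator data: more units, and in a different row order
den <- data.frame(uCode = c("AAA", "BBB", "CCC", "DDD"),
                  Population = c(1000, 500, 400, 9000))

# For each row of ind, find the row of den with the same uCode
idx <- match(ind$uCode, den$uCode)

# Stop if any unit has no denominator, rather than silently dividing by NA
stopifnot(!anyNA(idx))

# Divide each value by its own unit's denominator, then apply the scale factor
scale_factor <- 1000
ind$Flights_den <- ind$Flights / den$Population[idx] * scale_factor

print(ind)
```

Matching on the identifier rather than on row position is what makes the operation safe when the two tables differ in length or ordering, as they do with iData and WorldDenoms.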

The only other feature to mention is the f_denom argument, which allows a function other than division to be used. See the function documentation for details.
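To illustrate the idea of swapping division for another operation, the sketch below defines a small stand-alone helper, `apply_denom()`, which is hypothetical and not part of COINr. It assumes, purely for illustration, that the replacement function takes two numeric vectors (indicator values and denominator values) and returns one; the Denominate() documentation gives the exact requirements for f_denom.

```r
# Hypothetical helper (not a COINr function): combine an indicator with its
# denominator using any two-argument function, with division as the default
apply_denom <- function(x, d, f = function(x, d) x / d) {
  stopifnot(is.numeric(x), is.numeric(d), length(x) == length(d))
  f(x, d)
}

x <- c(200, 50, 10)   # made-up indicator values
d <- c(100, 10, 5)    # made-up denominator values

# Default behaviour: plain division
apply_denom(x, d)

# Alternative: log of the ratio, which compresses large ratios
apply_denom(x, d, f = function(x, d) log(x / d))
```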

Coins

Now let’s look at denomination inside a coin. The main difference here is that the information needed to denominate the indicators may already be present inside the coin. When creating the coin using new_coin(), there is the option to specify denominating variables as part of iData (these are variables where iMeta$Type = "Denominator"), and to specify in iMeta the mapping between indicators and denominators, using the iMeta$Denominator column. To see what this looks like:

# first few rows of the example iMeta, selected cols
head(ASEM_iMeta[c("iCode", "Denominator")])
#>     iCode Denominator
#> 1     LPI        <NA>
#> 2 Flights  Population
#> 3    Ship        <NA>
#> 4    Bord        Area
#> 5    Elec      Energy
#> 6     Gas      Energy

The entries in “Denominator” correspond to column names that are present in iData:

# see names of example iData
names(ASEM_iData)
#>  [1] "uName"         "uCode"         "GDP_group"     "GDPpc_group"  
#>  [5] "Pop_group"     "EurAsia_group" "Time"          "Area"         
#>  [9] "Energy"        "GDP"           "Population"    "LPI"          
#> [13] "Flights"       "Ship"          "Bord"          "Elec"         
#> [17] "Gas"           "ConSpeed"      "Cov4G"         "Goods"        
#> [21] "Services"      "FDI"           "PRemit"        "ForPort"      
#> [25] "Embs"          "IGOs"          "UNVote"        "CostImpEx"    
#> [29] "Tariff"        "TBTs"          "TIRcon"        "RTAs"         
#> [33] "Visa"          "StMob"         "Research"      "Pat"          
#> [37] "CultServ"      "CultGood"      "Tourist"       "MigStock"     
#> [41] "Lang"          "Renew"         "PrimEner"      "CO2"          
#> [45] "MatCon"        "Forest"        "Poverty"       "Palma"        
#> [49] "TertGrad"      "FreePress"     "TolMin"        "NGOs"         
#> [53] "CPI"           "FemLab"        "WomParl"       "PubDebt"      
#> [57] "PrivDebt"      "GDPGrow"       "RDExp"         "NEET"

So in our example, all the information needed to denominate is already present in the coin - the denominator data, and the mapping. In this case, to denominate, we simply call:

# build example coin
coin <- build_example_coin(up_to = "new_coin", quietly = TRUE)

# denominate (here, we only need to say which dset to use)
coin <- Denominate(coin, dset = "Raw")
#> Written data set to .$Data$Denominated

If the denomination data and/or mapping isn’t present in the coin, or we wish to try an alternative specification, we can also pass this to Denominate() using the denoms and denomby arguments as in the previous section.

This concludes the main features of Denominate() for a coin. Before moving on, consider that denomination needs extra care because it radically changes the indicator. It is a nonlinear transformation because each data point is divided by a different value. To demonstrate, consider the “Flights” indicator that we just denominated - let’s plot the raw indicator against the denominated version:

plot_scatter(coin, dsets = c("Raw", "Denominated"), iCodes = "Flights")

This shows that the raw and denominated indicators bear very little resemblance to one another.
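The same point can be checked numerically without a plot. In this sketch, which uses made-up values rather than the example coin, a linear rescaling (multiplying every point by the same constant) leaves the ordering of units untouched, whereas dividing each point by its own denominator reorders them.

```r
raw <- c(10, 40, 90, 200)    # made-up raw indicator values
den <- c(1, 10, 15, 100)     # made-up denominators, one per unit

linear <- raw * 0.5          # the same constant applied to every point
denominated <- raw / den     # a different divisor for each point

# Rank correlation with the raw data is exactly 1 for the linear rescaling
cor(raw, linear, method = "spearman")

# It is not 1 for the denominated version, because the ordering has changed
cor(raw, denominated, method = "spearman")
order(raw)
order(denominated)
```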

Purses

The final method for Denominate() is for purses. The purse method is exactly the same as the coin method, except applied to a purse.

An important consideration here is that denominator variables can and do vary with time, just like indicators. This means that e.g. “Total value of exports” from 2019 should be divided by GDP from 2019, and not from another year. In other words, denominators are panel data just like the indicators.

This is why denominators are ideally input as part of iData when calling new_coin(). In doing so, denominators are another column of the data frame like the indicators, and must have an entry for each unit/time pair. This also ensures that the unit-matching of denominator and indicator is correct (or more accurately, I leave that up to you!).
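As a sketch of what per-year denomination involves, the base R example below joins made-up indicator and denominator tables on both the unit code and the year before dividing. The column names uCode and Time follow the example data used in this document; the values and the `Exports_GDP` column name are invented for illustration, and this is not a description of how COINr works internally.

```r
# Made-up panel of an indicator: two units, two years
ind <- data.frame(uCode = c("AAA", "AAA", "BBB", "BBB"),
                  Time = c(2018, 2019, 2018, 2019),
                  Exports = c(50, 66, 20, 30))

# Made-up panel of a denominator, deliberately in a different row order
den <- data.frame(uCode = c("BBB", "AAA", "BBB", "AAA"),
                  Time = c(2019, 2019, 2018, 2018),
                  GDP = c(120, 110, 100, 100))

# Join on unit AND year so each value meets the denominator of the same year
panel <- merge(ind, den, by = c("uCode", "Time"))

# Denominate: exports as a share of same-year GDP
panel$Exports_GDP <- panel$Exports / panel$GDP

print(panel)
```

Joining on the unit code alone would pair each export value with every year's GDP for that unit, which is the mismatch the paragraph above warns against.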

In our example purse, the denominator data is already included, as is the mapping. This means that denomination is exactly the same operation as denominating a coin:

# build example purse
purse <- build_example_purse(up_to = "new_coin", quietly = TRUE)

# denominate using data/specs already included in the purse
purse <- Denominate(purse, dset = "Raw")
#> Written data set to .$Data$Denominated
#> Written data set to .$Data$Denominated
#> Written data set to .$Data$Denominated
#> Written data set to .$Data$Denominated
#> Written data set to .$Data$Denominated

In fact, if you try to pass denominator data to Denominate() for a purse via denoms, there is a catch: at the moment, denoms does not support panel data, so the same denominator value has to be used at every time point. This is not ideal and may be sorted out in future releases. For now, it is better to denominate purses by passing all the specifications via iData and iMeta when building the purse with new_coin().