4. Fantastic Groups and how to use them

Groupings are a central feature of sftrack. Structurally, they are built in the same vein as the sfc and sfg classes in sf.

To begin, an s_group is a single grouping; it is what is stored at the row level. A c_grouping is a collection of s_groups and exists at the column level. Groupings also have an active_group, which turns certain groups on and off for analysis and plotting purposes.

We start by looking at the structure of a c_grouping:

library("sftrack")
data("raccoon", package = "sftrack")
#raccoon <- read.csv(system.file("extdata/raccoon_data.csv", package="sftrack"))
group_list <- list(id = raccoon$animal_id, month = as.POSIXlt(raccoon$timestamp)$mon + 1)

cg1 <- make_c_grouping(x = group_list, active_group = c("id", "month"))
str(cg1)
## c_grouping object 
## List of 445
##  $ :List of 2
##   ..$ id   : chr "TTP-058"
##   .. [list output truncated]
##   ..- attr(*, "class")= chr "s_group"
##   [list output truncated]
##  - attr(*, "active_group")= chr [1:2] "id" "month"
##  - attr(*, "sort_index")= Factor w/ 4 levels "TTP-041_1","TTP-041_2",..: 3 3 3 3 3 3 3 3 3 3 ...
cg1[[1]]
## $id
## [1] "TTP-058"
## 
## $month
## [1] "1"

A grouping contains group-related information. The id of the subject or sensor is the lowest level of grouping for the data; any additional grouping variables are optional.

A c_grouping is simply a collection of s_groups. The s_group is where the grouping data is stored, and it can be modified at the row level. The s_group's main job is to store row-level grouping information and maintain consistency of the grouping variables.

Single groups (s_group)

An s_group holds the grouping variables for a single row of data.

You can make an s_group object by passing make_s_group() a list whose elements are the named group variables. In this example we have a single sensor named ‘TTP_058’ from a raccoon, and an additional grouping variable of month (entered as a number).

All grouping information is converted to character and stored that way in the s_group.

singlegroup <- make_s_group(list(id = "TTP_058", month = 4))
str(singlegroup)
## List of 2
##  $ id   : chr "TTP_058"
##  $ month: chr "4"
##  - attr(*, "class")= chr "s_group"

Because s_groups are simply lists, you can edit individual elements in an s_group:

singlegroup 
## $id
## [1] "TTP_058"
## 
## $month
## [1] "4"
singlegroup[1] <- "CJ15"
singlegroup$month <- "5"
str(singlegroup)
## List of 2
##  $ id   : chr "CJ15"
##  $ month: chr "5"
##  - attr(*, "class")= chr "s_group"
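
The character conversion described above can be imitated in base R. The following is a minimal sketch and not sftrack's implementation: toy_s_group() is a hypothetical helper that coerces each element of a named list to character and attaches a made-up class, so that the resulting object can still be edited like a list.

```r
# Hypothetical helper, base R only. It mimics the "store everything as
# character" behaviour described above and is not part of sftrack.
toy_s_group <- function(x) {
  # Require a list in which every element has a non-empty name
  stopifnot(is.list(x), !is.null(names(x)), all(names(x) != ""))
  out <- lapply(x, as.character)   # lapply keeps the element names
  class(out) <- "toy_s_group"
  out
}

g <- toy_s_group(list(id = "TTP_058", month = 4))
str(unclass(g))

# Elements can be replaced like any other list element
g$month <- "5"
g[["id"]] <- "CJ15"
```

Keeping every value as character means a numeric month and a text id can later be pasted together into a single label without type surprises.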

Column groupings (c_grouping)

A c_grouping is a collection of s_groups with the same grouping names, plus an ‘active_group’, which is a subset of the available group names.

Similarly to an s_group, you can make a c_grouping with make_c_grouping(). The argument x takes a list in which each named element is a vector of values for that grouping variable, and active_group takes a vector naming the active groups.

group_list <- list(id = rep(1:2, 10), year = rep(2020, 20))
cg <- make_c_grouping(x = group_list, active_group = c("id", "year"))
str(cg)
## c_grouping object 
## List of 20
##  $ :List of 2
##   ..$ id  : chr "1"
##   .. [list output truncated]
##   ..- attr(*, "class")= chr "s_group"
##   [list output truncated]
##  - attr(*, "active_group")= chr [1:2] "id" "year"
##  - attr(*, "sort_index")= Factor w/ 2 levels "1_2020","2_2020": 1 2 1 2 1 2 1 2 1 2 ...
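
To see how column-wise vectors relate to the row-wise structure shown by str(), here is a base R sketch. It is an illustration only, assuming nothing about sftrack internals: it turns the grouping vectors into one named list per row, which is the shape that a c_grouping holds.

```r
# Base R sketch (not sftrack internals): convert column-wise grouping
# vectors into one named list per row.
group_list <- list(id = rep(1:2, 10), year = rep(2020, 20))

rows <- lapply(seq_along(group_list$id), function(i) {
  # For row i, take the i-th value of every grouping variable as character
  lapply(group_list, function(col) as.character(col[i]))
})

length(rows)   # one element per row of data
rows[[2]]      # named list holding id and year for the second row
```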

You can also make a c_grouping by concatenating multiple s_groups. All s_groups must have the same names or an error is returned.

a <- make_s_group(list(id = 1, year = 2020))
b <- make_s_group(list(id = 1, year = 2021))
c <- make_s_group(list(id = 2, year = 2020))
cg <- c(a, b, c)
## 1_2020 & 1_2021 & 2_2020 has only one relocation
summary(cg)
##                 1_2020                 1_2021                 2_2020 
##                      1                      1                      1 
## active_group: id, year 
##                      0

You’ll notice that sftrack warns if any grouping combination has only one relocation. This matters for analyses that need more than one point to define a movement.

You can also combine c_groupings with c(). All names and active_groups must be the same.
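
The same kind of check can be sketched in base R by counting how often each group label occurs. The labels below are made up for illustration, and the code does not reproduce how sftrack generates its message.

```r
# Base R sketch: find group labels that occur only once.
# These labels are invented for the example.
labels <- c("1_2020", "1_2020", "1_2021", "2_2020", "2_2020", "2_2020")

counts <- table(labels)                              # rows per label
singletons <- names(counts)[as.vector(counts) == 1]  # labels seen once
singletons
```

Groups listed in singletons cannot form a step between two locations, so they may need to be dropped or merged before computing movement metrics.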

cg_combine <- c(cg,cg)
summary(cg_combine)
##                 1_2020                 1_2021                 2_2020 
##                      2                      2                      2 
## active_group: id, year 
##                      0

You can also edit groupings like a list, but the replacement must be an s_group object:

cg[1]
## [[1]]
## $id
## [1] "1"
## 
## $year
## [1] "2020"
cg[1] <- make_s_group(list(id = 3, year = 2019))
## 1_2021 & 2_2020 & 3_2019 has only one relocation
cg[1]
## [[1]]
## $id
## [1] "3"
## 
## $year
## [1] "2019"

The group names must also match the ones in the c_grouping, or an error is returned:

# Try to add an s_group with a month field when the original group had year instead 
try( cg[1] <- make_s_group(list(id = 3, month = 2019)) )
## Error in check_group_names(ret) : Group names do not match
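
A name-consistency check of this kind can be sketched in base R. The helper same_group_names() is hypothetical and only compares the sets of names; whether sftrack's own check also requires the same order is not something this sketch assumes.

```r
# Hypothetical check, base R only: do two named lists use the same
# set of group names? Order is ignored here.
same_group_names <- function(a, b) {
  setequal(names(a), names(b))
}

old <- list(id = "1", year = "2020")
ok  <- list(id = "3", year = "2019")
bad <- list(id = "3", month = "2019")   # month instead of year

same_group_names(old, ok)
same_group_names(old, bad)
```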

Selecting a grouping

As c_groupings are stored as lists, it can be difficult to refer to a single group or combination of groupings. This is where the sort_index comes in handy. The sort_index is a factor that combines the active_group variables for each row, built as paste(id, name1, name2, …, sep = "_"). The sort index is remade every time the active group changes, and can therefore be used to subset. You can access the sort index using group_labels().

group_list <- list(id = rep(1:2, 10), year = rep(2020, 20))
cg <- make_c_grouping(x = group_list, active_group = c("id", "year"))
group_labels(cg)[1:10]
##  [1] 1_2020 2_2020 1_2020 2_2020 1_2020 2_2020 1_2020 2_2020 1_2020 2_2020
## Levels: 1_2020 2_2020
# Subsetting a particular sensor from our raccoon data

data("raccoon", package = "sftrack")
raccoon$month <- as.POSIXlt(raccoon$timestamp)$mon + 1

raccoon$time <- as.POSIXct(raccoon$timestamp, tz = "EST")
coords <- c("longitude","latitude")
group <- list(id = raccoon$animal_id, month = as.POSIXlt(raccoon$timestamp)$mon+1)
time <- "time"

my_sftraj <- as_sftraj(data = raccoon, coords = coords, group = group, time = time)
head(my_sftraj[group_labels(my_sftraj) %in% c("TTP-058_1"), ])
## Sftraj with 6 features and 12 fields (3 empty geometries) 
## Geometry : "geometry" (XY, crs: NA) 
## Timestamp : "time" (POSIXct in UTC) 
## Grouping : "sft_group" (*id*, *month*) 
## -------------------------------
##   animal_id latitude longitude           timestamp height hdop vdop fix month
## 1   TTP-058       NA        NA 2019-01-19 00:02:30     NA  0.0  0.0  NO     1
## 2   TTP-058 26.06945 -80.27906 2019-01-19 01:02:30      7  6.2  3.2  2D     1
## 3   TTP-058       NA        NA 2019-01-19 02:02:30     NA  0.0  0.0  NO     1
## 4   TTP-058       NA        NA 2019-01-19 03:02:30     NA  0.0  0.0  NO     1
## 5   TTP-058 26.06769 -80.27431 2019-01-19 04:02:30    858  5.1  3.2  2D     1
## 6   TTP-058 26.06867 -80.27930 2019-01-19 05:02:30    350  1.9  3.2  3D     1
##                  time               sft_group                       geometry
## 1 2019-01-19 00:02:30 (id: TTP-058, month: 1)                    POINT EMPTY
## 2 2019-01-19 01:02:30 (id: TTP-058, month: 1)     POINT (-80.27906 26.06945)
## 3 2019-01-19 02:02:30 (id: TTP-058, month: 1)                    POINT EMPTY
## 4 2019-01-19 03:02:30 (id: TTP-058, month: 1)                    POINT EMPTY
## 5 2019-01-19 04:02:30 (id: TTP-058, month: 1) LINESTRING (-80.27431 26.06...
## 6 2019-01-19 05:02:30 (id: TTP-058, month: 1) LINESTRING (-80.2793 26.068...

You can also subset by entering the group label of the group itself in either the c_grouping or the sftrack/sftraj object:

head(cg["1_2020"])
## [[1]]
## $id
## [1] "1"
## 
## $year
## [1] "2020"
## 
## 
## [[2]]
## $id
## [1] "1"
## 
## $year
## [1] "2020"
## 
## 
## [[3]]
## $id
## [1] "1"
## 
## $year
## [1] "2020"
## 
## 
## [[4]]
## $id
## [1] "1"
## 
## $year
## [1] "2020"
## 
## 
## [[5]]
## $id
## [1] "1"
## 
## $year
## [1] "2020"
## 
## 
## [[6]]
## $id
## [1] "1"
## 
## $year
## [1] "2020"
sub <- my_sftraj["TTP-058_1", ]
print(sub, 5, 3)
## Sftraj with 207 features and 12 fields (64 empty geometries) 
## Geometry : "geometry" (XY, crs: NA) 
## Timestamp : "time" (POSIXct in UTC) 
## Grouping : "sft_group" (*id*, *month*) 
## -------------------------------
##   animal_id latitude longitude ...               sft_group
## 1   TTP-058       NA        NA ... (id: TTP-058, month: 1)
## 2   TTP-058 26.06945 -80.27906 ... (id: TTP-058, month: 1)
## 3   TTP-058       NA        NA ... (id: TTP-058, month: 1)
## 4   TTP-058       NA        NA ... (id: TTP-058, month: 1)
## 5   TTP-058 26.06769 -80.27431 ... (id: TTP-058, month: 1)
##                         geometry                time
## 1                    POINT EMPTY 2019-01-19 00:02:30
## 2     POINT (-80.27906 26.06945) 2019-01-19 01:02:30
## 3                    POINT EMPTY 2019-01-19 02:02:30
## 4                    POINT EMPTY 2019-01-19 03:02:30
## 5 LINESTRING (-80.27431 26.06... 2019-01-19 04:02:30

You can get the levels of the sort index with group_names(), which returns levels() of the sort index.

group_names(cg)
## [1] "1_2020" "2_2020"
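
The relationship between the labels, their levels, and subsetting can be reproduced with plain vectors. This is a base R sketch of the idea behind the sort index (pasting the active variables together and storing the result as a factor), not a call into sftrack.

```r
# Base R sketch of a sort index: paste the grouping variables together
# with "_" and store the result as a factor.
ids   <- rep(1:2, 10)
years <- rep(2020, 20)

sort_index <- factor(paste(ids, years, sep = "_"))

sort_index[1:4]                  # per-row labels
levels(sort_index)               # the distinct group names
which(sort_index == "1_2020")    # row positions belonging to one group
```

Because the labels are ordinary strings, a comparison or %in% against them gives a logical vector that can be used to subset rows.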

Active group

The active_group is a simple yet powerful feature. It dictates how your data is grouped for essentially all calculations. It can also be changed on the fly. You can view and change the active group of a column grouping with active_group(). Once changed, it recalculates the sort_index and in some cases recalculates the geometries.

Active groups can be changed for any sftrack, sftraj, or c_grouping object:

# sftraj
active_group(my_sftraj)
## [1] "id"    "month"
summary(my_sftraj, stats = TRUE)
##       group points NAs          begin_time            end_time     length_m
## 1 TTP-041_1    208   0 2019-01-19 00:02:30 2019-01-31 23:02:30 0.0533572052
## 2 TTP-041_2     15   0 2019-02-01 00:02:30 2019-02-01 23:02:07 0.0001556664
## 3 TTP-058_1    207   0 2019-01-19 00:02:30 2019-01-31 23:02:30 0.1779383524
## 4 TTP-058_2     15   0 2019-02-01 00:02:30 2019-02-01 23:02:30 0.0115635197
# change the active group to id only
active_group(my_sftraj) <- c("id")
active_group(my_sftraj)
## [1] "id"
summary(my_sftraj, stats = TRUE)
##     group points NAs          begin_time            end_time   length_m
## 1 TTP-041    223   0 2019-01-19 00:02:30 2019-02-01 23:02:07 0.05351287
## 2 TTP-058    222   0 2019-01-19 00:02:30 2019-02-01 23:02:30 0.18950187
# column groupings work the same way
active_group(cg)
## [1] "id"   "year"
active_group(cg) <- "id"
active_group(cg)
## [1] "id"
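
The effect of changing the active group on the labels can be sketched in base R. The helper make_labels() and the small groups data frame are hypothetical; they only illustrate why fewer active variables lead to fewer, larger groups.

```r
# Hypothetical data: two animals observed in two months
groups <- data.frame(
  id    = c("TTP-041", "TTP-041", "TTP-058", "TTP-058"),
  month = c("1", "2", "1", "2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rebuild the labels from the chosen (active) columns
make_labels <- function(df, active) {
  cols <- unname(as.list(df[active]))   # drop names so they are not
                                        # mistaken for paste() arguments
  factor(do.call(paste, c(cols, sep = "_")))
}

levels(make_labels(groups, c("id", "month")))   # one label per id and month
levels(make_labels(groups, "id"))               # one label per id
```

With both variables active there are four groups; with only id active the months collapse and two groups remain, which mirrors the change in the summary() output above.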