VS1.2 - A Real-Life Example: Swamp Tree Data

Elvan Ceyhan

2023-12-18

First we load the pcds package:

library(pcds)

1 Illustration of PCDs on Swamp Tree Data Set

We also illustrate the PCD-based methodology using a forestry dataset (namely, swamp tree data). Good and Whipple (1982) considered the spatial patterns of tree species along the Savannah River, South Carolina, U.S.A. From this data, Dixon (2002) used a single 50m \(\times\) 200m rectangular plot to illustrate his nearest neighbor contingency table methods. All live or dead trees with 4.5 cm or more dbh (diameter at breast height) were recorded together with their species. Hence, it is an example of a realization of a marked multi-variate point pattern. The plot contains 13 different tree species, four of which comprising over 90% of the 734 tree stems. This dataset consists of the following variables: \(x\) and \(y\) coordinates of the locations of the trees, liveness status (as 1 for “live” trees and 0 for “dead” trees), and the species of the trees.

We investigate the spatial interaction between live trees and dead trees where live trees are taken to be the \(\mathcal{X}\) points, while dead trees are taken to be the \(\mathcal{Y}\) points. The reason for this choice is that the live trees are much more abundant than dead trees, and the tests based on PCDs are more appropriate when the digraph is constructed with vertices from the more abundant class or species; hence Delaunay triangulation is based on the locations of dead trees. With these class designations, one can replicate the analysis in Section “VS1_1_2DArtiData”. The study area contains 630 live and 104 dead trees. See also Figure 1.1 which is suggestive of association of live trees with dead trees.

First, we read in the swamp tree dataset, and print the first 6 of them.

trees<-swamptrees
head(trees)
#>     x   y live sp
#> 1 1.5 6.1    1 TD
#> 2 3.1 3.9    0 FX
#> 3 3.0 3.8    0 FX
#> 4 3.1 3.4    1 FX
#> 5 4.8 6.9    1 FX
#> 6 4.5 3.9    1 NX
Xp<-trees[trees[,3]==1,][,1:2] # coordinates of all live trees
Yp<-trees[trees[,3]==0,][,1:2] # coordinates of all dead trees

Next, we present the scatterplot of (the locations of the) live trees (red circles) and dead trees (black squares) in the swamp tree dataset.

lab.fac=as.factor(trees$live)
lab=as.numeric(trees$live)
plot(trees[,1:2],col=lab+1,pch=lab,xlab="x",ylab="y",main="Scatter plot of live and dead trees")
The scatterplot of the Live Trees (red circles) and Dead Trees (black squares) in the Swamp Tree Dataset.

Figure 1.1: The scatterplot of the Live Trees (red circles) and Dead Trees (black squares) in the Swamp Tree Dataset.

The PCDs will be constructed with vertices being live trees and the binary relation that determines the arcs are based on proximity regions which depend on the Delaunay triangulation of dead trees. Below is the code for the scatterplot of the live trees together with the Delaunay triangulation of dead trees.

Xlim<-range(Xp[,1],Yp[,1])
Ylim<-range(Xp[,2],Yp[,2])
xd<-Xlim[2]-Xlim[1]
yd<-Ylim[2]-Ylim[1]
plot(Xp,xlab="x", ylab="y",xlim=Xlim+xd*c(-.05,.05),
     ylim=Ylim+yd*c(-.05,.05),pch=".",cex=3,main="Live Trees (solid squares) and Delaunay
     Triangulation of Dead Treess")
#now, we add the Delaunay triangulation based on Y points
DT<-interp::tri.mesh(Yp[,1],Yp[,2],duplicate="remove")
interp::plot.triSht(DT, add=TRUE, do.points = TRUE)

Or, alternatively, we can use the plotDelaunay.tri function in pcds to obtain the plot of the Delaunay triangulation by executing the line: plotDelaunay.tri(Xp,Yp,xlab="x",ylab="y",main="Live Tree Locations (solid squares) and Delaunay Triangulation of Dead Tree Locations").

The number of Delaunay triangles based on dead trees is obtained by the function num.delaunay.tri.

num.delaunay.tri(Yp)
#> [1] 194

1.1 Summary and Visualization with Arc-Slice PCD

Number of arcs of the AS-PCD can be computed by the function num.arcsAS whose arguments and output is described in Section “VS1_1_2DArtiData”. Here, only the number of arcs is presented, the other parts of the output are suppressed for brevity; to see them, just use narcs in the last line instead of narcs$num.arcs:

M<-"CC" #try also M<-c(1,1,1) #or M<-c(1,2,3)
Narcs=num.arcsAS(Xp,Yp,M)
Narcs$num.arcs
#> [1] 1849

#summary(Narcs)
#plot(Narcs)

As in Section “VS1_1_2DArtiData”,

1.2 Summary and Visualization with Proportional Edge PCDs

Number of arcs of the PE-PCD can be computed by the function num.arcsPE whose arguments and output is described in Section “VS1_1_2DArtiData”. Here, only the number of arcs is presented, the other parts of the output are suppressed for brevity; to see them, just use arcs in the last line:

M<-c(1,1,1) #try also M<-c(1,2,3) #or M<-"CC"
r<-1.5 #try also r<-2

Narcs=num.arcsPE(Xp,Yp,r,M)
Narcs$num.arcs
#> [1] 1429

#summary(Narcs)
#plot(Narcs)

As in the AS-PCD case above

1.2.1 Testing Spatial Interaction with the PE-PCDs

We can test the spatial pattern or interaction of segregation/association based on arc density or domination number of PE-PCDs.

We can test the spatial pattern or interaction of segregation/association based on arc density of PE-PCD using the function PEarc.dens.test (see Section “VS1_1_2DArtiData”, for details):

PEarc.dens.test(Xp,Yp,r) #try also PEarc.dens.test(Xp,Yp,r,alt="l") or #PEarc.dens.test(Xp,Yp,r,ch=TRUE)

#>  Large Sample z-Test Based on Arc Density of PE-PCD for Testing Uniformity of 2D Data ---
#>  without Convex Hull Correction
#>
#> data:  Xp
#> standardized arc density (i.e., Z) = -1.7333, p-value = 0.08304
#> alternative hypothesis: true (expected) arc density is not equal to 0.005521555
#> 95 percent confidence interval:
#>  0.003780462 0.005628405
#> sample estimates:
#> arc density
#> 0.004704434 

We can test the spatial pattern or interaction of segregation/association based on domination of PE-PCD using the functions PEdom.num.binom.test and PEdom.num.norm.test. We first provide two functions PEdom.num and PEdom.num.nondeg to compute the domination number of PE-PCDs (see Section “VS1_1_2DArtiData”, for details):

PEdom.num(Xp,Yp,r,M)
#> $dom.num
#> [1] 198
#>
#> $ind.mds
#>   [1]   8   4  18   6  16   7  11  67  64  17  14  78  82  75  19  60  83  88  87  27  23  28  29  54  94  80 154 148  95 153
#>  [31] 152  47  30 149 144 147  50 101  45  48 143 173 105 109  31  46 117  44  79 168 169 134 140 139 128 113 172 165 161 160
#>  [61] 177 219 187 214 209 126 220 201 208 183 188 125 225 226 224 223 210 213 242 247 200 196 250 282  43 123 241 243 294 258
#>  [91] 283 284 281 289 288 300 304 298 306 308 286 321 278 277 311 314 313 312 276 273 319 317 318 316 320 363 364 339 369 333
#> [121] 377 331 307 345 346 385 384 379 380 328 375 373 374 383 410 408 414 388 425 423 235 303 398 355 396 381 434 433 460 457
#> [151] 412 194 329 193 445 458 507 442 514 464 482 517 519 506 541 573 530 532 562 539 491 500 490 533 568 566 590 586 578 569
#> [181] 588 589 610 565 593 591 557 559 594 602 601 550 555 607 609 614 616 611
#>
#> $tri.dom.nums
#>   [1] 0 1 0 1 1 0 1 0 3 0 0 0 0 0 2 0 2 0 1 1 1 0 0 0 0 1 1 0 0 0 0 1 2 0 0 0 1 0 0 1 2 1 0 1 1 0 1 0 1 1 0 1 1 1 1 0 3 2 2 0 0
#>  [62] 0 2 0 1 1 0 0 2 0 1 1 0 1 0 2 1 2 2 1 1 2 0 1 0 0 0 0 0 1 1 0 0 1 1 1 1 0 2 1 1 1 0 3 1 1 1 2 0 2 2 2 2 1 0 1 0 2 3 1 2 2
#> [123] 1 0 1 2 2 2 0 2 0 0 0 0 2 2 0 1 2 1 1 1 2 3 2 1 0 1 1 3 1 3 1 2 1 1 3 3 3 3 1 1 0 1 1 3 2 2 0 2 1 2 3 1 1 0 0 2 2 2 1 1 0
#> [184] 3 0 2 1 0 2 2 1 1 3 0
#>
#> PEdom.num.nondeg(Xp,Yp,r)
#> $dom.num
#> [1] 198
#>
#> $ind.mds
#>   [1]   8   4  18   6  11  16   7  67  64  65  14  78  82  75  19  60  83  87  88  27  23  29  28  54  94  80 154 148  95 153
#>  [31] 152  47  30 147 144 149 101  51  48  45 173 143 105 109  32  46 118  44  79 171 169 134 139 140 113 128 172 165 161 160
#>  [61] 177 219 187 214 209 126 220 208 201 183 188 125 226 225 224 223 210 213 218 242 200 196 250 282 123  43 243 241 294 258
#>  [91] 283 284 289 281 288 300 304 298 306 308 286 321 278 277 311 314 313 312 276 273 319 317 316 318 320 364 363 339 369 333
#> [121] 331 377 346 345 307 385 384 379 380 328 373 375 374 383 410 414 408 388 423 425 235 303 396 398 355 381 433 434 412 457
#> [151] 460 329 193 194 445 458 507 442 464 482 514 517 519 506 541 573 530 532 562 539 490 491 500 533 568 566 590 586 578 569
#> [181] 588 589 610 591 565 593 559 557 594 601 602 555 550 607 609 611 614 616
#>
#> $tri.dom.nums
#>   [1] 0 1 0 1 1 0 1 0 3 0 0 0 0 0 2 0 2 0 1 1 1 0 0 0 0 1 1 0 0 0 0 1 2 0 0 0 1 0 0 1 2 1 0 1 1 0 1 0 1 1 0 1 1 1 1 0 3 2 2 0 0
#>  [62] 0 2 0 1 1 0 0 2 0 1 1 0 1 0 2 1 2 2 1 1 2 0 1 0 0 0 0 0 1 1 0 0 1 1 1 1 0 2 1 1 1 0 3 1 1 1 2 0 2 2 2 2 1 0 1 0 2 3 1 2 2
#> [123] 1 0 1 2 2 2 0 2 0 0 0 0 2 2 0 1 2 1 1 1 2 3 2 1 0 1 1 3 1 3 1 2 1 1 3 3 3 3 1 1 0 1 1 3 2 2 0 2 1 2 3 1 1 0 0 2 2 2 1 1 0
#> [184] 3 0 2 1 0 2 2 1 1 3 0 

We can test the spatial pattern with (the binomial approximation of the) domination number of the PE-PCDs with the function PEdom.num.binom.test.

PEdom.num.binom.test(Xp,Yp,r) #try also PEdom.num.binom.test(Xp,Yp,r,alt="g")

#>  Large Sample Binomial Test based on the Domination Number of PE-PCD for Testing Uniformity of 2D
#>  Data ---
#>  without Convex Hull Correction
#>
#> data:  Xp
#> #(domination number is <= 2) = 179, p-value = 1.921e-10
#> alternative hypothesis: true Pr(Domination Number <=2) is not equal to 0.7413
#> 95 percent confidence interval:
#>  0.8756797 0.9560803
#> sample estimates:
#>           domination number   || Pr(domination number <= 2)
#>                   198.0000000                     0.9226804 

We can test the spatial pattern with (the normal approximation for the) domination number of the PE-PCDs with the function PEdom.num.norm.test.

PEdom.num.norm.test(Xp,Yp,r) #try also PEdom.num.norm.test(Xp,Yp,r,alt="g")

#>  Normal Approximation to the Domination Number of PE-PCD for Testing Uniformity of 2D Data ---
#>  without Convex Hull Correction
#>
#> data:  Xp
#> standardized domination number (i.e., Z) = 5.7689, p-value = 7.977e-09
#> alternative hypothesis: true expected domination number is not equal to 143.8122
#> 95 percent confidence interval:
#>  186.0451 209.9549
#> sample estimates:
#>          domination number   || Pr(domination number = 3)
#>                  198.0000000                    0.9226804 

In all these tests, convex hull correction for the \(\mathcal{X}\) points falling outside the convex hull of \(\mathcal{Y}\) points can be implemented with ch.cor option, with ch.cor=TRUE implementing the correction, default is ch.cor=FALSE. Furthermore, we only provide the two-sided tests above, although both one-sided versions are also available.

1.3 Summary and Visualization with Central Similarity PCDs

Number of arcs of the CS-PCD can be computed by the function num.arcsCS, see Section “VS1_1_2DArtiData”, for details. Only the number of arcs is presented, the other parts of the output are suppressed for brevity, to see them, just use arcs in the last line:

M<-c(1,1,1) #try also M<-c(1,2,3)
tau<-1.5 #try also tau<-2, and tau=.5

Narcs=num.arcsCS(Xp,Yp,tau,M)
Narcs$num.arcs
#> [1] 955

#summary(Narcs)
#plot(Narcs)

As in the AS-PCD and PE-PCD cases above

1.3.1 Testing Spatial Interaction with the CS-PCDs

We can test the spatial pattern or interaction of segregation/association based on arc density of CS-PCD using the function CSarc.dens.test (see Section “VS1_1_2DArtiData”, for details):

CSarc.dens.test(Xp,Yp,tau) #try also CSarc.dens.test(Xp,Yp,tau,alt="l")

#>  Large Sample z-Test Based on Arc Density of CS-PCD for Testing Uniformity of 2D Data ---
#>  without Convex Hull Correction
#>
#> data:  Xp
#> standardized arc density (i.e., Z) = -1.7446, p-value = 0.08106
#> alternative hypothesis: true (expected) arc density is not equal to 0.003837374
#> 95 percent confidence interval:
#>  0.002373249 0.003922496
#> sample estimates:
#> arc density
#> 0.003147873  

Here, convex hull correction for \(\mathcal{X}\) points outside the convex hull of \(\mathcal{Y}\) points can be implemented with ch.cor option, with ch.cor=TRUE implementing the correction, default is ch.cor=FALSE. For example, CSarc.dens.test(Xp,Yp,tau,ch=TRUE) would yield the convex hull corrected version of the arc-based test of spatial interaction. Furthermore, we only provide the two-sided test below, although both one-sided versions are also available. See also Ceyhan, Priebe, and Marchette (2007) and Ceyhan (2014) for more detail.

References

Ceyhan, E. 2014. “Comparison of Relative Density of Two Random Geometric Digraph Families in Testing Spatial Clustering.” TEST 23(1): 100–134.
Ceyhan, E., C. E. Priebe, and D. J. Marchette. 2007. “A New Family of Random Graphs for Testing Spatial Segregation.” Canadian Journal of Statistics 35(1): 27–50.
Dixon, P. M. 2002. “Nearest-Neighbor Contingency Table Analysis of Spatial Segregation for Several Species.” Ecoscience 9(2): 142–51.
Good, B. J., and S. A. Whipple. 1982. “Tree Spatial Patterns: South Carolina Bottomland and Swamp Forests.” Bulletin of the Torrey Botanical Club 109(4): 529–36.