Sampling

Simple Random Sampling

The goal of this example is to carry out a simple random sampling of a 46.8 ha area with a target error of 20%. Ten plots of 3000 m² were measured for a pilot inventory. The data are as follows:

dados_acs_piloto
#> # A tibble: 10 × 4
#>   TOTAL_AREA PLOT_AREA   VWB VWB_m3ha
#>        <dbl>     <int> <int>    <dbl>
#> 1       46.8      3000    41    137. 
#> 2       46.8      3000    33    110  
#> 3       46.8      3000    24     80  
#> 4       46.8      3000    31    103. 
#> 5       46.8      3000    10     33.3
#> 6       46.8      3000    32    107. 
#> # ℹ 4 more rows
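
The VWB_m3ha column is the plot volume scaled to one hectare. A minimal sketch of that conversion, using only the six rows printed above (the variable names are illustrative, not part of the package):

```r
# Plot volumes (m3) from the six rows shown above
vwb <- c(41, 33, 24, 31, 10, 32)
plot_area_m2 <- 3000

# Scale each plot volume to one hectare (10000 m2)
vwb_m3ha <- vwb * 10000 / plot_area_m2
round(vwb_m3ha, 1)
```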

Now we run the pilot inventory with a 20% error, treating the population as finite, using the sprs function. Note that the plot area must be given in square meters and the total area in hectares:

sprs(dados_acs_piloto, "VWB", 3000, 46.8, error = 20, pop = "fin")
#>                                        Variables    Values
#> 1              Total number of sampled plots (n)   10.0000
#> 2                    Number of maximum plots (N)  156.0000
#> 3                      Variance Quoeficient (VC)   53.2670
#> 4                                      t-student    2.2622
#> 5                         recalculated t-student    2.0452
#> 6  Number of samples regarding the admited error   25.0000
#> 7                                  Variance (S2)  328.0000
#> 8                         Standard deviation (s)   18.1108
#> 9                                       Mean (Y)   34.0000
#> 10               Standard error of the mean (Sy)    5.5405
#> 11                                Absolute Error   12.5335
#> 12                            Relative Error (%)   36.8634
#> 13                  Estimated Total Value (Yhat) 5304.0000
#> 14                                   Total Error 1955.2326
#> 15             Inferior Confidence Interval (m3)   21.4665
#> 16             Superior Confidence Interval (m3)   46.5335
#> 17          Inferior Confidence Interval (m3/ha)   71.5549
#> 18          Superior Confidence Interval (m3/ha)  155.1118
#> 19       inferior Total Confidence Interval (m3) 3348.7674
#> 20       Superior Total Confidence Interval (m3) 7259.2326
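
Rows 2 to 6 of the output can be reproduced by hand. The sketch below assumes the usual finite-population sample size formula, n = t²CV² / (E² + t²CV²/N), applied once with the pilot t value and once more with the recalculated t. That procedure is inferred from the printed values, not taken from the sprs source, and the helper name `sample_size` is hypothetical:

```r
# Values read from the pilot output above
n_pilot <- 10
cv      <- 53.2670                 # coefficient of variation (%)
error   <- 20                      # admitted error (%)
N       <- 46.8 * 10000 / 3000     # maximum number of plots: 156

# Sample size for a finite population, given a t value (assumed formula)
sample_size <- function(t, cv, error, N) {
  ceiling((t^2 * cv^2) / (error^2 + (t^2 * cv^2) / N))
}

t_first <- qt(0.975, df = n_pilot - 1)          # about 2.2622
n_first <- sample_size(t_first, cv, error, N)   # first estimate

t_recalc   <- qt(0.975, df = n_first - 1)       # about 2.0452
n_required <- sample_size(t_recalc, cv, error, N)

n_required             # 25
n_required - n_pilot   # 15 additional plots
```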

These results show that reaching the desired error requires 25 plots (row 6), that is, 15 more than the 10 already measured. After a new round of sampling, the data are as follows:

dados_acs_def
#> # A tibble: 25 × 3
#>   TOTAL_AREA PLOT_AREA   VWB
#>        <dbl>     <int> <int>
#> 1       46.8      3000    41
#> 2       46.8      3000    33
#> 3       46.8      3000    24
#> 4       46.8      3000    31
#> 5       46.8      3000    10
#> 6       46.8      3000    32
#> # ℹ 19 more rows

Now the definitive inventory is run, again with a 20% error and a finite population:

sprs(dados_acs_def, "VWB", 3000, 46.8, error = 20, pop = "fin")
#>                                        Variables    Values
#> 1              Total number of sampled plots (n)   25.0000
#> 2                    Number of maximum plots (N)  156.0000
#> 3                      Variance Quoeficient (VC)   45.4600
#> 4                                      t-student    2.0639
#> 5                         recalculated t-student    2.0930
#> 6  Number of samples regarding the admited error   20.0000
#> 7                                  Variance (S2)  226.6933
#> 8                         Standard deviation (s)   15.0563
#> 9                                       Mean (Y)   33.1200
#> 10               Standard error of the mean (Sy)    2.7595
#> 11                                Absolute Error    5.6952
#> 12                            Relative Error (%)   17.1957
#> 13                  Estimated Total Value (Yhat) 5166.7200
#> 14                                   Total Error  888.4555
#> 15             Inferior Confidence Interval (m3)   27.4248
#> 16             Superior Confidence Interval (m3)   38.8152
#> 17          Inferior Confidence Interval (m3/ha)   91.4159
#> 18          Superior Confidence Interval (m3/ha)  129.3841
#> 19       inferior Total Confidence Interval (m3) 4278.2645
#> 20       Superior Total Confidence Interval (m3) 6055.1755

The target error was met: the relative error is 17.2%, below the 20% limit.

Area values can also be supplied as column names:
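
As a check, the relative error can be recomputed from rows 1, 2, 8 and 9 of the definitive output. This sketch assumes the standard error carries the finite population correction sqrt(1 - n/N), which agrees with the printed values. The variable names are illustrative:

```r
# Values read from the definitive output above
n      <- 25        # sampled plots
N      <- 156       # maximum number of plots
s      <- 15.0563   # standard deviation
y_mean <- 33.12     # mean volume per plot

# Standard error of the mean with finite population correction
se <- s / sqrt(n) * sqrt(1 - n / N)

# 95% two-sided t value, absolute and relative error
t_val     <- qt(0.975, df = n - 1)
abs_error <- t_val * se
rel_error <- 100 * abs_error / y_mean

round(c(se = se, abs_error = abs_error, rel_error = rel_error), 4)
rel_error <= 20   # TRUE: the 20% target is met
```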

sprs(dados_acs_def, "VWB", "PLOT_AREA", "TOTAL_AREA", 
     error = 20, pop = "fin")
#>                                        Variables    Values
#> 1              Total number of sampled plots (n)   25.0000
#> 2                    Number of maximum plots (N)  156.0000
#> 3                      Variance Quoeficient (VC)   45.4600
#> 4                                      t-student    2.0639
#> 5                         recalculated t-student    2.0930
#> 6  Number of samples regarding the admited error   20.0000
#> 7                                  Variance (S2)  226.6933
#> 8                         Standard deviation (s)   15.0563
#> 9                                       Mean (Y)   33.1200
#> 10               Standard error of the mean (Sy)    2.7595
#> 11                                Absolute Error    5.6952
#> 12                            Relative Error (%)   17.1957
#> 13                  Estimated Total Value (Yhat) 5166.7200
#> 14                                   Total Error  888.4555
#> 15             Inferior Confidence Interval (m3)   27.4248
#> 16             Superior Confidence Interval (m3)   38.8152
#> 17          Inferior Confidence Interval (m3/ha)   91.4159
#> 18          Superior Confidence Interval (m3/ha)  129.3841
#> 19       inferior Total Confidence Interval (m3) 4278.2645
#> 20       Superior Total Confidence Interval (m3) 6055.1755

It is also possible to run several simple random sampling inventories at once. To demonstrate this, we use the example data for the stratified inventory, but compute simple random sampling statistics for each stratum. We call sprs and pass the grouping variable to the .groups argument. Since there are several areas in this case, the total area must be given as a column:

sprs(dados_ace_def, "VWB", "PLOT_AREA", "STRATA_AREA",
     .groups = "STRATA", error = 20, pop = "fin")
#>                                        Variables  STRATA1   STRATA2   STRATA3
#> 1              Total number of sampled plots (n)  14.0000   20.0000   23.0000
#> 2                    Number of maximum plots (N) 144.0000  164.0000  142.0000
#> 3                      Variance Quoeficient (VC)  24.4785   15.8269   16.7813
#> 4                                      t-student   2.1604    2.0930    2.0739
#> 5                         recalculated t-student   2.4469    4.3027    4.3027
#> 6  Number of samples regarding the admited error   9.0000   11.0000   12.0000
#> 7                                  Variance (S2)   2.1829    3.6161    5.3192
#> 8                         Standard deviation (s)   1.4774    1.9016    2.3063
#> 9                                       Mean (Y)   6.0357   12.0150   13.7435
#> 10               Standard error of the mean (Sy)   0.3752    0.3984    0.4402
#> 11                                Absolute Error   0.8105    0.8339    0.9130
#> 12                            Relative Error (%)  13.4288    6.9409    6.6431
#> 13                  Estimated Total Value (Yhat) 869.1429 1970.4600 1951.5739
#> 14                                   Total Error 116.7157  136.7670  129.6455
#> 15             Inferior Confidence Interval (m3)   5.2252   11.1811   12.8305
#> 16             Superior Confidence Interval (m3)   6.8462   12.8489   14.6565
#> 17          Inferior Confidence Interval (m3/ha)  52.2519  111.8105  128.3048
#> 18          Superior Confidence Interval (m3/ha)  68.4624  128.4895  146.5647
#> 19       inferior Total Confidence Interval (m3) 752.4271 1833.6930 1821.9284
#> 20       Superior Total Confidence Interval (m3) 985.8586 2107.2270 2081.2194
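
Each column of the grouped output is an independent simple random sampling inventory. The sketch below recomputes N, the standard error, the relative error and the total for the three strata in one vectorized pass. It starts from summary values printed above rather than the raw data, and the data frame and its column names are illustrative:

```r
# Per-stratum values read from the grouped output above
strata <- data.frame(
  STRATA  = c(1, 2, 3),
  n       = c(14, 20, 23),               # sampled plots
  area_ha = c(14.4, 16.4, 14.2),         # STRATA_AREA
  s       = c(1.4774, 1.9016, 2.3063),   # standard deviation
  y_mean  = c(6.0357, 12.0150, 13.7435)  # mean volume per plot
)
plot_area_m2 <- 1000

# Maximum number of plots that fit in each stratum
strata$N <- strata$area_ha * 10000 / plot_area_m2

# Standard error with finite population correction, then relative error (%)
strata$se        <- with(strata, s / sqrt(n) * sqrt(1 - n / N))
strata$rel_error <- with(strata, 100 * qt(0.975, df = n - 1) * se / y_mean)

# Estimated total volume per stratum
strata$total <- with(strata, y_mean * N)

strata
```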

Stratified Random Sampling

The goal of this example is to carry out a stratified random sampling of an area with a target error of 5%. The area was divided into 3 strata: one with 14.4 ha and 7 plots, one with 16.4 ha and 8 plots, and one with 14.2 ha and 7 plots. Each plot has an area of 1000 m². In total, 22 plots were measured for the pilot inventory. The data are as follows:

dados_ace_piloto
#> # A tibble: 22 × 5
#>   STRATA STRATA_AREA PLOT_AREA   VWB VWB_m3ha
#>    <int>       <dbl>     <int> <dbl>    <dbl>
#> 1      1        14.4      1000  7.9      79  
#> 2      1        14.4      1000  3.8      38  
#> 3      1        14.4      1000  4.4      44  
#> 4      1        14.4      1000  6.25     62.5
#> 5      1        14.4      1000  5.55     55.5
#> 6      1        14.4      1000  8.1      81  
#> # ℹ 16 more rows

Now we run the inventory with a target error of 5%, treating the population as finite, using the strs function. Area values can be given as numbers or as column names. For the strata areas, a numeric vector can also be used. The plot area must be given in square meters and the strata areas in hectares:

Note that the call below passes 3000 as the plot area, whereas PLOT_AREA in the data is 1000 m². The printed output therefore reports a plot area of 3000 m² and N = 150, and its totals and per-hectare intervals are scaled accordingly.

strs(dados_ace_piloto, "VWB", 3000, c(14.4, 16.4, 14.2), 
     strata = "STRATA", error = 5, pop = "fin")
#> $Table1
#>                                             Variables  STRATA 1  STRATA 2
#> 1                                         STRATA_AREA   14.4000   16.4000
#> 2                                           Plot Area 3000.0000 3000.0000
#> 3            Number of sampled plots per stratum (nj)    7.0000    8.0000
#> 4                   Total number of sampled plots (n)   22.0000   22.0000
#> 5            Number of maximum plots per stratum (Nj)   48.0000   54.6667
#> 6                         Number of maximum plots (N)  150.0000  150.0000
#> 7                                     Nj/N Ratio (Pj)    0.3200    0.3644
#> 8                                   Stratum sum (Eyj)   42.1000   98.2500
#> 9                        Stratum quadratic sum (Eyj2)  268.8950 1237.2275
#> 10                        Mean of Yi per stratum (Yj)    6.0143   12.2812
#> 11                                              PjSj2    0.8370    1.5929
#> 12                                               PjSj    0.5175    0.7619
#> 13                                               PjYj    1.9246    4.4758
#> 14                                          t-student    2.0796    2.0796
#> 15                             recalculated t-student    2.0129    2.0129
#> 16      Number of samples regarding the admited error   45.0000   45.0000
#> 17 Optimal number of samples per stratum (nj optimal)   11.0000   16.0000
#> 18              Optimal number of samples (n optimal)   46.0000   46.0000
#> 19               Total value of Y per stratum (Yhatj)  288.6857  671.3750
#>     STRATA 3
#> 1    14.2000
#> 2  3000.0000
#> 3     7.0000
#> 4    22.0000
#> 5    47.3333
#> 6   150.0000
#> 7     0.3156
#> 8    96.1000
#> 9  1365.5500
#> 10   13.7286
#> 11    2.4316
#> 12    0.8760
#> 13    4.3321
#> 14    2.0796
#> 15    2.0129
#> 16   45.0000
#> 17   19.0000
#> 18   46.0000
#> 19  649.8190
#> 
#> $Table2
#>                                  Variables     value
#> 1                                t-student    2.0796
#> 2          Standard error of the mean (Sy)    0.4228
#> 3                      Stratified Variance    4.8614
#> 4            Stratified Standard Deviation    2.1554
#> 5                Variance Quoeficient (VC)   20.0829
#> 6                      Stratified Mean (Y)   10.7325
#> 7                           Absolute Error    0.8793
#> 8                       Relative Error (%)    8.1925
#> 9             Estimated Total Value (Yhat) 1609.8798
#> 10                             Total Error  131.8894
#> 11       Inferior Confidence Interval (m3)    9.8533
#> 12       Superior Confidence Interval (m3)   11.6118
#> 13    Inferior Confidence Interval (m3/ha)   32.8442
#> 14    Superior Confidence Interval (m3/ha)   38.7060
#> 15 inferior Total Confidence Interval (m3) 1477.9904
#> 16 Superior Total Confidence Interval (m3) 1741.7691

Table 1 shows that reaching the 5% error requires 46 plots in total (row 18), 24 more than the 22 already measured: 4 in stratum 1, 8 in stratum 2, and 12 in stratum 3 (row 17 minus row 3).

After the additional plots were measured, the data for the definitive inventory are as follows:
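
Rows 15 to 18 of Table1 can be reproduced from the other printed values. The sketch below assumes a finite-population sample size formula for stratified sampling followed by Neyman allocation (proportional to PjSj, rounded up). That procedure is inferred from the output, not taken from the strs source, and the helper name `sample_size` is hypothetical:

```r
# Values read from Table1 and Table2 of the pilot output above
n_sampled <- c(7, 8, 7)                  # plots already measured per stratum
pj_sj     <- c(0.5175, 0.7619, 0.8760)   # PjSj row
strat_var <- 4.8614                      # Stratified Variance (sum of PjSj2)
y_mean    <- 10.7325                     # Stratified Mean
N         <- 150                         # Number of maximum plots
error_abs <- 0.05 * y_mean               # 5% of the mean, in the units of VWB

# Total sample size for a given t value (assumed formula)
sample_size <- function(t) {
  ceiling(t^2 * sum(pj_sj)^2 / (error_abs^2 + t^2 * strat_var / N))
}

n_first <- sample_size(qt(0.975, df = sum(n_sampled) - 1))   # 47
n_total <- sample_size(qt(0.975, df = n_first - 1))          # 45

# Neyman allocation: share the total in proportion to PjSj, rounding up
nj_optimal <- ceiling(n_total * pj_sj / sum(pj_sj))

nj_optimal               # 11 16 19
sum(nj_optimal)          # 46
nj_optimal - n_sampled   # 4 8 12 additional plots
```

Because each stratum is rounded up separately, the allocated total (46) can exceed the unrounded requirement (45).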

dados_ace_def
#> # A tibble: 57 × 5
#>   STRATA STRATA_AREA PLOT_AREA   VWB VWB_m3ha
#>    <int>       <dbl>     <int> <dbl>    <dbl>
#> 1      1        14.4      1000  7.9      79  
#> 2      1        14.4      1000  3.8      38  
#> 3      1        14.4      1000  4.4      44  
#> 4      1        14.4      1000  6.25     62.5
#> 5      1        14.4      1000  5.55     55.5
#> 6      1        14.4      1000  8.1      81  
#> # ℹ 51 more rows

Now we run the inventory again, this time on the definitive data. We again use a 5% error and treat the population as finite:

strs(dados_ace_def, "VWB", "PLOT_AREA", "STRATA_AREA", 
     strata = "STRATA", error = 5, pop = "fin")
#> $Table1
#>                                             Variables  STRATA 1  STRATA 2
#> 1                                         STRATA_AREA   14.4000   16.4000
#> 2                                           Plot Area 1000.0000 1000.0000
#> 3            Number of sampled plots per stratum (nj)   14.0000   20.0000
#> 4                   Total number of sampled plots (n)   57.0000   57.0000
#> 5            Number of maximum plots per stratum (Nj)  144.0000  164.0000
#> 6                         Number of maximum plots (N)  450.0000  450.0000
#> 7                                     Nj/N Ratio (Pj)    0.3200    0.3644
#> 8                                   Stratum sum (Eyj)   84.5000  240.3000
#> 9                        Stratum quadratic sum (Eyj2)  538.3950 2955.9100
#> 10                        Mean of Yi per stratum (Yj)    6.0357   12.0150
#> 11                                              PjSj2    0.6985    1.3179
#> 12                                               PjSj    0.4728    0.6930
#> 13                                               PjYj    1.9314    4.3788
#> 14                                          t-student    2.0032    2.0032
#> 15                             recalculated t-student    2.0141    2.0141
#> 16      Number of samples regarding the admited error   46.0000   46.0000
#> 17 Optimal number of samples per stratum (nj optimal)   12.0000   17.0000
#> 18              Optimal number of samples (n optimal)   47.0000   47.0000
#> 19               Total value of Y per stratum (Yhatj)  869.1429 1970.4600
#>     STRATA 3
#> 1    14.2000
#> 2  1000.0000
#> 3    23.0000
#> 4    57.0000
#> 5   142.0000
#> 6   450.0000
#> 7     0.3156
#> 8   316.1000
#> 9  4461.3350
#> 10   13.7435
#> 11    1.6785
#> 12    0.7278
#> 13    4.3368
#> 14    2.0032
#> 15    2.0141
#> 16   46.0000
#> 17   18.0000
#> 18   47.0000
#> 19 1951.5739
#> 
#> $Table2
#>                                  Variables     value
#> 1                                t-student    2.0032
#> 2          Standard error of the mean (Sy)    0.2339
#> 3                      Stratified Variance    3.6949
#> 4            Stratified Standard Deviation    1.8936
#> 5                Variance Quoeficient (VC)   17.7851
#> 6                      Stratified Mean (Y)   10.6471
#> 7                           Absolute Error    0.4685
#> 8                       Relative Error (%)    4.4003
#> 9             Estimated Total Value (Yhat) 4791.1768
#> 10                             Total Error  210.8250
#> 11       Inferior Confidence Interval (m3)   10.1786
#> 12       Superior Confidence Interval (m3)   11.1156
#> 13    Inferior Confidence Interval (m3/ha)  101.7856
#> 14    Superior Confidence Interval (m3/ha)  111.1556
#> 15 inferior Total Confidence Interval (m3) 4580.3518
#> 16 Superior Total Confidence Interval (m3) 5002.0018

The target error was met: the relative error is 4.4%, below the 5% limit.

Systematic Sampling

Now we sample an 18-hectare area in which 18 plots of 200 m² were laid out systematically:
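
The main rows of Table2 follow from the per-stratum rows of Table1. The sketch below assumes the stratified mean is the Pj-weighted mean of the strata and that the standard error is sqrt((sum PjSj)²/n - (sum PjSj2)/N). Both expressions agree with the printed values, but they are inferred from the output, not quoted from the package:

```r
# Values read from Table1 and Table2 of the definitive output above
pj        <- c(144, 164, 142) / 450        # Nj/N Ratio (Pj)
yj        <- c(6.0357, 12.0150, 13.7435)   # Mean of Yi per stratum (Yj)
pj_sj     <- c(0.4728, 0.6930, 0.7278)     # PjSj row
strat_var <- 3.6949                        # Stratified Variance (sum of PjSj2)
n <- 57                                    # total sampled plots
N <- 450                                   # maximum number of plots

# Stratified mean and estimated total
y_strat <- sum(pj * yj)
y_total <- y_strat * N

# Standard error of the stratified mean for a finite population
se <- sqrt(sum(pj_sj)^2 / n - strat_var / N)

# Relative error (%) at 95% confidence
rel_error <- 100 * qt(0.975, df = n - 1) * se / y_strat

round(c(mean = y_strat, total = y_total, se = se, rel_error = rel_error), 4)
```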

dados_as
#> # A tibble: 18 × 4
#>   TOTAL_AREA PLOT_AREA   VWB VWB_m3ha
#>        <dbl>     <int> <int>    <dbl>
#> 1         10       200     6      300
#> 2         10       200     8      400
#> 3         10       200     9      450
#> 4         10       200    10      500
#> 5         10       200    13      650
#> 6         10       200    12      600
#> # ℹ 12 more rows

First, let us see what error would be obtained with the simple random sampling estimator:

sprs(dados_as, "VWB", 200, 18)
#>                                        Variables     Values
#> 1              Total number of sampled plots (n)    18.0000
#> 2                    Number of maximum plots (N)   900.0000
#> 3                      Variance Quoeficient (VC)    44.6505
#> 4                                      t-student     2.1098
#> 5                         recalculated t-student     1.9873
#> 6  Number of samples regarding the admited error    79.0000
#> 7                                  Variance (S2)    81.9771
#> 8                         Standard deviation (s)     9.0541
#> 9                                       Mean (Y)    20.2778
#> 10               Standard error of the mean (Sy)     2.1341
#> 11                                Absolute Error     4.5025
#> 12                            Relative Error (%)    22.2042
#> 13                  Estimated Total Value (Yhat) 18250.0000
#> 14                                   Total Error  4052.2580
#> 15             Inferior Confidence Interval (m3)    15.7753
#> 16             Superior Confidence Interval (m3)    24.7803
#> 17          Inferior Confidence Interval (m3/ha)   788.7634
#> 18          Superior Confidence Interval (m3/ha)  1239.0143
#> 19       inferior Total Confidence Interval (m3) 14197.7420
#> 20       Superior Total Confidence Interval (m3) 22302.2580

The resulting error was 22.2%. Now we compute the error using the successive differences method with the ss_diffs function. Note that the data must be supplied in measurement order, the plot area must be given in square meters, and the total area in hectares:

ss_diffs(dados_as, "VWB", 200, 18)
#>                                        Variables     Values
#> 1              Total number of sampled plots (n)    18.0000
#> 2                    Number of maximum plots (N)   900.0000
#> 3                      Variance Quoeficient (VC)    44.6505
#> 4                                      t-student     2.1098
#> 5                         recalculated t-student     1.9873
#> 6  Number of samples regarding the admited error    79.0000
#> 7                                  Variance (S2)    81.9771
#> 8                         Standard deviation (S)     9.0541
#> 9                                       Mean (Y)    20.2778
#> 10               Standard error of the mean (Sy)     0.4041
#> 11                                Absolute Error     0.8527
#> 12                            Relative Error (%)     4.2050
#> 13                  Estimated Total Value (Yhat) 18250.0000
#> 14                                   Total Error   767.4046
#> 15             Inferior Confidence Interval (m3)    19.4251
#> 16             Superior Confidence Interval (m3)    21.1304
#> 17          Inferior Confidence Interval (m3/ha)   971.2553
#> 18          Superior Confidence Interval (m3/ha)  1056.5225
#> 19       inferior Total Confidence Interval (m3) 17482.5954
#> 20       Superior Total Confidence Interval (m3) 19017.4046

The resulting error was 4.2%, a substantial reduction from the 22.2% obtained with the simple random sampling estimator.
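
To illustrate why the two estimators differ, the sketch below applies a common textbook form of the successive differences estimator, Var(mean) = sum((y_i - y_{i+1})²) / (2n(n - 1)) × (1 - n/N), to the six rows of dados_as printed above. The exact expression used by ss_diffs is an assumption here, and because only six of the 18 plots are used, the numbers will not match the output above:

```r
# First six plot volumes of dados_as, in measurement order
y <- c(6, 8, 9, 10, 13, 12)
n <- length(y)
N <- 18 * 10000 / 200   # maximum number of plots: 900

# Simple random sampling: variance of the mean from the sample variance
se_srs <- sqrt(var(y) / n * (1 - n / N))

# Successive differences: variance of the mean from squared differences
# between neighbouring plots (assumed textbook form)
d <- diff(y)
se_diffs <- sqrt(sum(d^2) / (2 * n * (n - 1)) * (1 - n / N))

c(se_srs = se_srs, se_diffs = se_diffs)
```

When neighbouring plots are similar, as in a stand with a spatial trend, the squared differences are small relative to the overall variance, so the successive differences standard error is lower.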