Pooling Columns

Carl James Schwarz

2024-01-25

1 Why it is not possible to compare different column poolings.

There are two issues that currently preclude a direct implementation of logical pooling for columns.

First, unlike pooling rows, where physical pooling is equivalent to equating the initial tagging probabilities, there is no equivalent rule for recovery probabilities when pooling columns. In the current model, the expected values for the cell counts (Schwarz and Taylor, 1998) in the 2 x 3 case (i.e. \(s \le t\)) are:

| Releases | Recovery stratum 1 | Recovery stratum 2 | Recovery stratum 3 |
|---|---|---|---|
| Release stratum 1 | \(p_1 \theta_{11} r_1\) | \(p_1 \theta_{12} r_2\) | \(p_1 \theta_{13} r_3\) |
| Release stratum 2 | \(p_2 \theta_{21} r_1\) | \(p_2 \theta_{22} r_2\) | \(p_2 \theta_{23} r_3\) |
| Unmarked | \(\sum_i (1-p_i)\theta_{i1} r_1\) | \(\sum_i (1-p_i)\theta_{i2} r_2\) | \(\sum_i (1-p_i)\theta_{i3} r_3\) |

where \(p_i\) is the tagging probability in release stratum \(i\); \(r_j\) is the recovery probability in recovery stratum \(j\); and \(\theta_{ij}\) is the probability of moving from release stratum \(i\) to recovery stratum \(j\). In the case of \(s < t\), the recovery probabilities are not separately identifiable because, in any column \(j\), we can multiply every \(\theta_{ij}\) by a constant \(k\) and divide \(r_j\) by \(k\) and get exactly the same expected values. Hence one can always force \(r_i = r_j\) for any pair of recovery strata \(i\) and \(j\) by an appropriate choice of \(k\) for each column.
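The rescaling argument can be checked numerically. The following is a minimal sketch in base R; the values of `p`, `theta`, and `r` and the helper `expected.cells()` are hypothetical and chosen only for illustration (they are not estimates from any data set). Each column of `theta` is multiplied by a constant `k` and the corresponding `r` is divided by the same `k`, with `k` chosen so that all recovery probabilities become equal.

```r
# Hypothetical parameter values, for illustration only
p     <- c(0.10, 0.20)                         # tagging probabilities (2 release strata)
theta <- matrix(c(0.5, 0.3, 0.2,
                  0.2, 0.3, 0.5),
                nrow = 2, byrow = TRUE)        # movement probabilities (2 x 3)
r     <- c(0.30, 0.40, 0.50)                   # recovery probabilities (3 recovery strata)

# Expected cell values (up to the common population-size multiplier)
expected.cells <- function(p, theta, r) {
  theta.r  <- sweep(theta, 2, r, "*")          # theta_ij * r_j
  marked   <- p * theta.r                      # p_i * theta_ij * r_j
  unmarked <- colSums((1 - p) * theta.r)       # sum_i (1 - p_i) * theta_ij * r_j
  rbind(marked, unmarked)
}

# Rescale each column: theta_ij * k_j and r_j / k_j, with k_j chosen
# so that every recovery probability equals max(r)
k         <- r / max(r)
theta.new <- sweep(theta, 2, k, "*")
r.new     <- r / k

e.orig <- expected.cells(p, theta, r)
e.new  <- expected.cells(p, theta.new, r.new)

r.new                      # all recovery probabilities are now equal
all.equal(e.orig, e.new)   # the expected values are unchanged
```

Because the two parameter sets give the same expected values, the data cannot distinguish a model with equal recovery probabilities from one with unequal recovery probabilities.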

Second, in theory, you can always pool columns and NOT affect the fit. As an analogy, the SPAS model is (to a first-order approximation) a regression problem, i.e. given the following matrix of recoveries and unmarked counts

| Stratum | Recovery stratum 1 | Recovery stratum 2 | Recovery stratum 3 |
|---|---|---|---|
| Release stratum 1 | \(m_{11}\) | \(m_{12}\) | \(m_{13}\) |
| Release stratum 2 | \(m_{21}\) | \(m_{22}\) | \(m_{23}\) |
| Unmarked | \(u_1\) | \(u_2\) | \(u_3\) |

You want to find estimates of \(\beta_1\) and \(\beta_2\) such that
\[u_1 \approx \beta_1 m_{11}+\beta_2 m_{21}\]
\[u_2 \approx \beta_1 m_{12}+\beta_2 m_{22}\]
\[u_3 \approx \beta_1 m_{13}+\beta_2 m_{23}\]

Pooling columns 2 and 3 reduces this system of equations to:
\[u_1 \approx \beta_1 m_{11}+\beta_2 m_{21}\]
\[u_2+u_3 \approx \beta_1 (m_{12}+m_{13})+\beta_2 (m_{22} +m_{23})\]
which has the identical fit.

In the regression analogy, pooling columns is like pooling two data points by adding their respective \(X\) and \(Y\) values. You will get very similar regression estimates, particularly if you do a weighted regression to account for the doubling of the variation when you add two points together.
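The idealized version of this analogy can be sketched in base R. The recovery counts below are the marked recoveries from the sample data set used later in this document, but the coefficients `beta.true` and the unmarked counts `u` built from them are hypothetical: they are constructed so that the linear relation holds exactly. In that case the pooled equation is just the sum of two of the original equations, so the same coefficients solve both systems. With real, noisy counts the two sets of estimates need not agree this closely.

```r
# Marked recoveries: rows = release strata, columns = recovery strata
m <- rbind(c(160, 127, 72, 82),
           c( 24,  66, 13, 10))

# Hypothetical coefficients and exactly consistent unmarked counts
beta.true <- c(60, 30)
u <- as.vector(beta.true %*% m)

# Unpooled: one "data point" per recovery stratum, regression through the origin
X4   <- t(m)
fit4 <- lm.fit(X4, u)$coefficients

# Pool recovery strata 3 and 4 by adding the corresponding X and Y values
pool <- c(1, 2, 3, 3)
X3   <- rowsum(X4, pool)
u3   <- as.vector(rowsum(u, pool))
fit3 <- lm.fit(X3, u3)$coefficients

rbind(unpooled = fit4, pooled = fit3)
```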

So in theory, pooling columns should have a negligible effect on the estimates of the population size. At the moment, I therefore doubt that it is possible to implement logical pooling of columns and compare column poolings using AIC. It would be possible to implement logical pooling of columns, but it solves an uninteresting problem: testing whether the products of the movement and recovery probabilities are identical for all release groups in the two columns, which is seldom biologically plausible.

2 A sample dataset

This sample data set was adapted from the Canadian Department of Fisheries and Oceans and represents releases and recaptures of female fish in the Lower Shuswap region.


test.data.csv <- textConnection("
 160   ,   127   ,     72   ,     82   ,   3592
  24   ,    66   ,     13   ,     10   ,    532
7960   ,  9720   ,   6264   ,   7934   ,   0  ")

test.data <- as.matrix(read.csv(test.data.csv, header=FALSE, strip.white=TRUE))
test.data
#>        V1   V2   V3   V4   V5
#> [1,]  160  127   72   82 3592
#> [2,]   24   66   13   10  532
#> [3,] 7960 9720 6264 7934    0
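The sketch below spells out how this matrix is read, assuming the usual SPAS input layout: the first two rows are release strata, the first four columns are recovery strata, the last column holds released fish that were never recovered, and the last row holds the unmarked fish captured in each recovery stratum. As a check on that reading, it computes the Chapman estimator from the completely pooled counts using the standard textbook formulas; the variable names are illustrative and are not part of the SPAS package. The result agrees with the Chapman estimate printed near the top of each model summary in the sections that follow.

```r
x <- matrix(c( 160,  127,   72,   82, 3592,
                24,   66,   13,   10,  532,
              7960, 9720, 6264, 7934,    0),
            nrow = 3, byrow = TRUE)

n.rel <- nrow(x) - 1                 # number of release strata (s)
n.rec <- ncol(x) - 1                 # number of recovery strata (t)

m             <- x[1:n.rel, 1:n.rec] # marked fish recovered, by release x recovery stratum
not.recovered <- x[1:n.rel, n.rec + 1] # marked fish never seen again
u             <- x[n.rel + 1, 1:n.rec] # unmarked fish captured at recovery

n1 <- sum(m) + sum(not.recovered)    # total fish released with marks
m2 <- sum(m)                         # total marked fish recovered
n2 <- m2 + sum(u)                    # total fish examined at recovery

# Chapman estimator and its usual standard error (standard formulas;
# that SPAS computes them this way is an assumption)
N.chapman  <- (n1 + 1) * (n2 + 1) / (m2 + 1) - 1
se.chapman <- sqrt((n1 + 1) * (n2 + 1) * (n1 - m2) * (n2 - m2) /
                   ((m2 + 1)^2 * (m2 + 2)))

round(c(n1 = n1, n2 = n2, m2 = m2, N = N.chapman, SE = se.chapman))
```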

3 Models with 2 rows and different number of columns

We now fit three models examining the impact of pooling columns while keeping the two release strata (rows) separate.

3.1 Full 2x4 stratified analysis

library(SPAS)
mod..1 <- SPAS.fit.model(test.data,
                       model.id="No restrictions",
                       row.pool.in=1:2, col.pool.in=1:4)
#> Using nlminb to find conditional MLE
#> outer mgc:  25668.84 
#> outer mgc:  31226.96 
#> outer mgc:  27658.72 
#> outer mgc:  8924.818 
#> outer mgc:  6133.671 
#> outer mgc:  743.5753 
#> outer mgc:  27.31259 
#> outer mgc:  0.4124965 
#> outer mgc:  0.08327987 
#> outer mgc:  0.03053576 
#> Convergence codes from nlminb  0 relative convergence (4) 
#> Finding conditional estimate of N

SPAS.print.model(mod..1)
#> Model Name: No restrictions 
#>    Date of Fit: 2024-01-25 12:19 
#>    Version of OPEN SPAS used : SPAS-R 2023-03-31 
#>  
#> Raw data 
#>        V1   V2   V3   V4   V5
#> [1,]  160  127   72   82 3592
#> [2,]   24   66   13   10  532
#> [3,] 7960 9720 6264 7934    0
#> 
#> Row pooling setup : 1 2 
#> Col pooling setup : 1 2 3 4 
#> Physical pooling  : TRUE 
#> Theta pooling     : FALSE 
#> CJS pooling       : FALSE 
#> 
#> 
#> Chapman estimator of population size  273430  (SE  10793  )
#>  
#> 
#> Raw data AFTER PHYSICAL (but not logical) POOLING 
#>       pool1 pool2 pool3 pool4   V5
#> pool1   160   127    72    82 3592
#> pool2    24    66    13    10  532
#>        7960  9720  6264  7934    0
#> 
#> Condition number of XX' where X= (physically) pooled matrix is  45.64022 
#> Condition number of XX' after logical pooling                   45.64022 
#> 
#> Large value of kappa (>1000) indicate that rows are approximately proportional which is not good
#> 
#>   Conditional   Log-Likelihood: 285428.4    ;  np: 12 ;  AICc: -570832.8 
#> 
#>   Code/Message from optimization is:  0 relative convergence (4) 
#> 
#> Estimates
#>               pool1  pool2  pool3  pool4  psi cap.prob exp factor Pop Est
#> pool1         110.8  134.4   86.5  109.4 3592    0.014       72.3  295561
#> pool2          24.0   66.0   13.0   10.0  532    1.000        0.0     645
#> est unmarked 8009.0 9713.0 6250.0 7907.0    0       NA         NA  296206
#> 
#> SE of above estimates
#>              pool1 pool2 pool3 pool4  psi cap.prob exp factor Pop Est
#> pool1          5.4   6.5   4.2   5.3 59.9    0.001        3.5   13978
#> pool2          4.9   8.1   3.6   3.2 23.1    0.000        0.0       0
#> est unmarked    NA    NA    NA    NA  0.0       NA         NA   13192
#> 
#> 
#> Chisquare gof cutoff  : 0.1 
#> Chisquare gof value   : 31.95852 
#> Chisquare gof df      : 2 
#> Chisquare gof p       : 1.148937e-07

3.2 Reducing to a 2x3 matrix by pooling columns 3 and 4.

mod..2 <- SPAS.fit.model(test.data,
                           model.id="Pool last two columns",
                           row.pool.in=c(1,2), col.pool.in=c(1,2,34,34))
#> Using nlminb to find conditional MLE
#> outer mgc:  25647.14 
#> outer mgc:  30979.46 
#> outer mgc:  26430.02 
#> outer mgc:  12526.52 
#> outer mgc:  4361.936 
#> outer mgc:  373.9823 
#> outer mgc:  76.32601 
#> outer mgc:  13.19575 
#> outer mgc:  0.437045 
#> outer mgc:  0.04367686 
#> outer mgc:  0.01676388 
#> Convergence codes from nlminb  0 relative convergence (4) 
#> Finding conditional estimate of N

SPAS.print.model(mod..2)
#> Model Name: Pool last two columns 
#>    Date of Fit: 2024-01-25 12:19 
#>    Version of OPEN SPAS used : SPAS-R 2023-03-31 
#>  
#> Raw data 
#>        V1   V2   V3   V4   V5
#> [1,]  160  127   72   82 3592
#> [2,]   24   66   13   10  532
#> [3,] 7960 9720 6264 7934    0
#> 
#> Row pooling setup : 1 2 
#> Col pooling setup : 1 2 34 34 
#> Physical pooling  : TRUE 
#> Theta pooling     : FALSE 
#> CJS pooling       : FALSE 
#> 
#> 
#> Chapman estimator of population size  273430  (SE  10793  )
#>  
#> 
#> Raw data AFTER PHYSICAL (but not logical) POOLING 
#>       pool1 pool2 pool34   V5
#> pool1   160   127    154 3592
#> pool2    24    66     23  532
#>        7960  9720  14198    0
#> 
#> Condition number of XX' where X= (physically) pooled matrix is  50.91806 
#> Condition number of XX' after logical pooling                   50.91806 
#> 
#> Large value of kappa (>1000) indicate that rows are approximately proportional which is not good
#> 
#>   Conditional   Log-Likelihood: 295293.6    ;  np: 10 ;  AICc: -590567.3 
#> 
#>   Code/Message from optimization is:  0 relative convergence (4) 
#> 
#> Estimates
#>               pool1  pool2  pool34  psi cap.prob exp factor Pop Est
#> pool1         110.8  134.4   195.8 3592    0.014       72.3  295561
#> pool2          24.0   66.0    23.0  532    1.000        0.0     645
#> est unmarked 8009.0 9713.0 14156.0    0       NA         NA  296206
#> 
#> SE of above estimates
#>              pool1 pool2 pool34  psi cap.prob exp factor Pop Est
#> pool1          5.4   6.5    9.4 59.9    0.001        3.5   13978
#> pool2          4.9   8.1    4.8 23.1    0.000        0.0       0
#> est unmarked    NA    NA     NA  0.0       NA         NA   13192
#> 
#> 
#> Chisquare gof cutoff  : 0.1 
#> Chisquare gof value   : 31.62033 
#> Chisquare gof df      : 1 
#> Chisquare gof p       : 1.874568e-08

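3.3 Reducing to a 2x2 matrix by pooling columns 1 and 2, and then 3 and 4.

As written, col.pool.in=c(12,22,34,34) gives columns 1 and 2 different labels (12 and 22), so only columns 3 and 4 are pooled and the fit below reproduces the 2x3 result of Section 3.2. Pooling columns 1 and 2 as well requires a common label, e.g. col.pool.in=c(12,12,34,34), as used in Section 4.3.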

mod..3 <- SPAS.fit.model(test.data,
                           model.id="Pool last two columns",
                           row.pool.in=c(1,2), col.pool.in=c(12,22,34,34))
#> Using nlminb to find conditional MLE
#> outer mgc:  25647.14 
#> outer mgc:  30979.46 
#> outer mgc:  26430.02 
#> outer mgc:  12526.52 
#> outer mgc:  4361.936 
#> outer mgc:  373.9823 
#> outer mgc:  76.32601 
#> outer mgc:  13.19575 
#> outer mgc:  0.437045 
#> outer mgc:  0.04367686 
#> outer mgc:  0.01676388 
#> Convergence codes from nlminb  0 relative convergence (4) 
#> Finding conditional estimate of N

SPAS.print.model(mod..3)
#> Model Name: Pool last two columns 
#>    Date of Fit: 2024-01-25 12:19 
#>    Version of OPEN SPAS used : SPAS-R 2023-03-31 
#>  
#> Raw data 
#>        V1   V2   V3   V4   V5
#> [1,]  160  127   72   82 3592
#> [2,]   24   66   13   10  532
#> [3,] 7960 9720 6264 7934    0
#> 
#> Row pooling setup : 1 2 
#> Col pooling setup : 12 22 34 34 
#> Physical pooling  : TRUE 
#> Theta pooling     : FALSE 
#> CJS pooling       : FALSE 
#> 
#> 
#> Chapman estimator of population size  273430  (SE  10793  )
#>  
#> 
#> Raw data AFTER PHYSICAL (but not logical) POOLING 
#>       pool12 pool22 pool34   V5
#> pool1    160    127    154 3592
#> pool2     24     66     23  532
#>         7960   9720  14198    0
#> 
#> Condition number of XX' where X= (physically) pooled matrix is  50.91806 
#> Condition number of XX' after logical pooling                   50.91806 
#> 
#> Large value of kappa (>1000) indicate that rows are approximately proportional which is not good
#> 
#>   Conditional   Log-Likelihood: 295293.6    ;  np: 10 ;  AICc: -590567.3 
#> 
#>   Code/Message from optimization is:  0 relative convergence (4) 
#> 
#> Estimates
#>              pool12 pool22  pool34  psi cap.prob exp factor Pop Est
#> pool1         110.8  134.4   195.8 3592    0.014       72.3  295561
#> pool2          24.0   66.0    23.0  532    1.000        0.0     645
#> est unmarked 8009.0 9713.0 14156.0    0       NA         NA  296206
#> 
#> SE of above estimates
#>              pool12 pool22 pool34  psi cap.prob exp factor Pop Est
#> pool1           5.4    6.5    9.4 59.9    0.001        3.5   13978
#> pool2           4.9    8.1    4.8 23.1    0.000        0.0       0
#> est unmarked     NA     NA     NA  0.0       NA         NA   13192
#> 
#> 
#> Chisquare gof cutoff  : 0.1 
#> Chisquare gof value   : 31.62033 
#> Chisquare gof df      : 1 
#> Chisquare gof p       : 1.874568e-08
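The effect of the column labels can be seen without fitting a model. The sketch below uses a hypothetical helper, `pool.columns()`, which is not part of SPAS; it mimics physical pooling by summing the recovery columns that share a label and leaving the final (never recovered) column alone. It assumes that columns are pooled only when their labels are identical, which is consistent with the pooled matrices printed in the output above.

```r
x <- matrix(c( 160,  127,   72,   82, 3592,
                24,   66,   13,   10,  532,
              7960, 9720, 6264, 7934,    0),
            nrow = 3, byrow = TRUE)

# Hypothetical helper: sum recovery columns sharing a label,
# keep the last (never recovered) column as is
pool.columns <- function(x, labels) {
  n.rec  <- length(labels)
  groups <- split(seq_len(n.rec), labels)
  pooled <- sapply(groups, function(idx) rowSums(x[, idx, drop = FALSE]))
  cbind(pooled, x[, n.rec + 1])
}

as.written <- pool.columns(x, c(12, 22, 34, 34))  # labels 12 and 22 differ
intended   <- pool.columns(x, c(12, 12, 34, 34))  # columns 1 and 2 share a label

as.written   # three recovery columns: same data as the 2x3 fit
intended     # two recovery columns: the 2x2 case
```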

3.4 Comparing the estimates

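Notice that the population estimate and its standard error are identical in the pooled and unpooled cases (rows 1 to 3 of the summary below; the remaining rows do not correspond to models fit in this document). You cannot compare the AICc values because the data set changes when columns are physically pooled.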

#>      .id       date                      model.id s.a.pool t.p.pool logL.cond
#> 1 mod..1 2024-01-25               No restrictions        2        4 285428.40
#> 2 mod..2 2024-01-25         Pool last two columns        2        3 295293.64
#> 3 mod..3 2024-01-25         Pool last two columns        2        3 295293.64
#> 4 mod..4 2024-01-25                Pooled Peteren        1        1  60410.94
#> 5 mod..5 2024-01-25     Logical Pooling some rows        6        5  47585.39
#> 6 mod..6 2024-01-25   A single row - Logical Pool        6        5  47580.39
#> 7 mod..7 2024-01-25 Pooled Peteren - Logical Pool        6        1  56584.83
#> 8 mod..8 2024-01-25    Logical Pooling pairs rows        6        5  47584.73
#>   np       AICc gof.chisq gof.df gof.p   Nhat Nhat.se
#> 1 12 -570832.81      32.0      2 0.000 296206   13192
#> 2 10 -590567.29      31.6      1 0.000 296206   13192
#> 3 10 -590567.29      31.6      1 0.000 296206   13192
#> 4  3 -120815.88       0.0      0    NA  70426    4545
#> 5 40  -95090.77       2.8      1 0.096  73440   10424
#> 6 37  -95086.79      13.1      4 0.011  70426    4545
#> 7 13 -113143.67       0.0      0    NA  70426    4545
#> 8 39  -95091.46       3.8      2 0.147  83198   13337

4 Comparing results with 1 row and different number of columns

4.1 Pooling over all rows using physical pooling

mod..3 <- SPAS.fit.model(test.data,
                           model.id="Physical pooling to single row",
                           row.pool.in=c(1,1), col.pool.in=1:4)
#> Using nlminb to find conditional MLE
#> outer mgc:  31867.55 
#> outer mgc:  31249.72 
#> outer mgc:  28928.32 
#> outer mgc:  14737.62 
#> outer mgc:  2903.05 
#> outer mgc:  230.5378 
#> outer mgc:  8.523751 
#> outer mgc:  0.1178486 
#> outer mgc:  4.01421e-05 
#> Convergence codes from nlminb  0 both X-convergence and relative convergence (5) 
#> Finding conditional estimate of N
SPAS.print.model(mod..3)
#> Model Name: Physical pooling to single row 
#>    Date of Fit: 2024-01-25 12:19 
#>    Version of OPEN SPAS used : SPAS-R 2023-03-31 
#>  
#> Raw data 
#>        V1   V2   V3   V4   V5
#> [1,]  160  127   72   82 3592
#> [2,]   24   66   13   10  532
#> [3,] 7960 9720 6264 7934    0
#> 
#> Row pooling setup : 1 1 
#> Col pooling setup : 1 2 3 4 
#> Physical pooling  : TRUE 
#> Theta pooling     : FALSE 
#> CJS pooling       : FALSE 
#> 
#> 
#> Chapman estimator of population size  273430  (SE  10793  )
#>  
#> 
#> Raw data AFTER PHYSICAL (but not logical) POOLING 
#>   pool1 pool2 pool3 pool4   V5
#> 1   184   193    85    92 4124
#>    7960  9720  6264  7934    0
#> 
#> Condition number of XX' where X= (physically) pooled matrix is  1 
#> Condition number of XX' after logical pooling                   1 
#> 
#> Large value of kappa (>1000) indicate that rows are approximately proportional which is not good
#> 
#>   Conditional   Log-Likelihood: 287272.7    ;  np: 6 ;  AICc: -574533.3 
#> 
#>   Code/Message from optimization is:  0 both X-convergence and relative convergence (5) 
#> 
#> Estimates
#>               pool1  pool2  pool3  pool4  psi cap.prob exp factor Pop Est
#> 1             139.1  169.3  108.5  137.1 4124    0.017       57.5  273857
#> est unmarked 8005.0 9744.0 6241.0 7889.0    0       NA         NA  273857
#> 
#> SE of above estimates
#>              pool1 pool2 pool3 pool4  psi cap.prob exp factor Pop Est
#> 1              6.1   7.3   4.8     6 64.2    0.001        2.5   10831
#> est unmarked    NA    NA    NA    NA  0.0       NA         NA   10831
#> 
#> 
#> Chisquare gof cutoff  : 0.1 
#> Chisquare gof value   : 38.35232 
#> Chisquare gof df      : 3 
#> Chisquare gof p       : 2.380379e-08

4.2 Pooling over all rows using logical pooling

mod..3a <- SPAS.fit.model(test.data,
                           model.id="Logical pooling to single row",
                           row.pool.in=c(1,1), col.pool.in=1:4, row.physical.pool=FALSE)
#> Using nlminb to find conditional MLE
#> outer mgc:  31865.52 
#> outer mgc:  31209.17 
#> outer mgc:  29588.57 
#> outer mgc:  5095.582 
#> outer mgc:  3430.827 
#> outer mgc:  423.876 
#> outer mgc:  603.7936 
#> outer mgc:  25.22585 
#> outer mgc:  91.84818 
#> outer mgc:  0.6298842 
#> outer mgc:  0.00497048 
#> outer mgc:  3.80732e-07 
#> Convergence codes from nlminb  0 relative convergence (4) 
#> Finding conditional estimate of N
SPAS.print.model(mod..3a)
#> Model Name: Logical pooling to single row 
#>    Date of Fit: 2024-01-25 12:19 
#>    Version of OPEN SPAS used : SPAS-R 2023-03-31 
#>  
#> Raw data 
#>        V1   V2   V3   V4   V5
#> [1,]  160  127   72   82 3592
#> [2,]   24   66   13   10  532
#> [3,] 7960 9720 6264 7934    0
#> 
#> Row pooling setup : 1 1 
#> Col pooling setup : 1 2 3 4 
#> Physical pooling  : FALSE 
#> Theta pooling     : FALSE 
#> CJS pooling       : FALSE 
#> 
#> 
#> Chapman estimator of population size  273430  (SE  10793  )
#>  
#> 
#> Raw data AFTER PHYSICAL (but not logical) POOLING 
#>        pool1 pool2 pool3 pool4   V5
#> pool.1   160   127    72    82 3592
#> pool.1    24    66    13    10  532
#>         7960  9720  6264  7934    0
#> 
#> Condition number of XX' where X= (physically) pooled matrix is  45.64022 
#> Condition number of XX' after logical pooling                   1 
#> 
#> Large value of kappa (>1000) indicate that rows are approximately proportional which is not good
#> 
#>   Conditional   Log-Likelihood: 285423.8    ;  np: 11 ;  AICc: -570825.7 
#> 
#>   Code/Message from optimization is:  0 relative convergence (4) 
#> 
#> Estimates
#>               pool1  pool2  pool3  pool4  psi cap.prob exp factor Pop Est
#> pool.1        121.0  111.4   91.9  122.2 3592    0.017       57.5  236098
#> pool.1         18.1   57.9   16.6   14.9  532    0.017       57.5   37759
#> est unmarked 8005.0 9744.0 6241.0 7889.0    0       NA         NA  273857
#> 
#> SE of above estimates
#>              pool1 pool2 pool3 pool4  psi cap.prob exp factor Pop Est
#> pool.1         6.3   7.5   5.9   6.9 59.9    0.001        2.5    9945
#> pool.1         3.5   6.3   4.3   4.5 23.1    0.001        2.5    1590
#> est unmarked    NA    NA    NA    NA  0.0       NA         NA   10831
#> 
#> 
#> Chisquare gof cutoff  : 0.1 
#> Chisquare gof value   : 38.35233 
#> Chisquare gof df      : 3 
#> Chisquare gof p       : 2.380369e-08

4.3 Pooling over all rows and pairs of columns (1 and 2, 3 and 4) using physical pooling

# do physical pooling of all rows and of column pairs (1,2) and (3,4)
mod..4 <- SPAS.fit.model(test.data,
                           model.id="Physical pooling all rows and last two columns",
                           row.pool.in=c(1,1), col.pool.in=c(12,12,34,34))
#> Using nlminb to find conditional MLE
#> outer mgc:  31867.53 
#> outer mgc:  30927.52 
#> outer mgc:  25741.32 
#> outer mgc:  14099.8 
#> outer mgc:  11055.59 
#> outer mgc:  1852.554 
#> outer mgc:  165.8634 
#> outer mgc:  11.48313 
#> outer mgc:  0.2138227 
#> outer mgc:  7.423679e-05 
#> outer mgc:  2.239631e-11 
#> Convergence codes from nlminb  0 both X-convergence and relative convergence (5) 
#> Finding conditional estimate of N
SPAS.print.model(mod..4)
#> Model Name: Physical pooling all rows and last two columns 
#>    Date of Fit: 2024-01-25 12:19 
#>    Version of OPEN SPAS used : SPAS-R 2023-03-31 
#>  
#> Raw data 
#>        V1   V2   V3   V4   V5
#> [1,]  160  127   72   82 3592
#> [2,]   24   66   13   10  532
#> [3,] 7960 9720 6264 7934    0
#> 
#> Row pooling setup : 1 1 
#> Col pooling setup : 12 12 34 34 
#> Physical pooling  : TRUE 
#> Theta pooling     : FALSE 
#> CJS pooling       : FALSE 
#> 
#> 
#> Chapman estimator of population size  273430  (SE  10793  )
#>  
#> 
#> Raw data AFTER PHYSICAL (but not logical) POOLING 
#>   pool12 pool34   V5
#> 1    377    177 4124
#>    17680  14198    0
#> 
#> Condition number of XX' where X= (physically) pooled matrix is  1 
#> Condition number of XX' after logical pooling                   1 
#> 
#> Large value of kappa (>1000) indicate that rows are approximately proportional which is not good
#> 
#>   Conditional   Log-Likelihood: 309568    ;  np: 4 ;  AICc: -619127.9 
#> 
#>   Code/Message from optimization is:  0 both X-convergence and relative convergence (5) 
#> 
#> Estimates
#>               pool12  pool34  psi cap.prob exp factor Pop Est
#> 1              308.4   245.6 4124    0.017       57.5  273857
#> est unmarked 17749.0 14129.0    0       NA         NA  273857
#> 
#> SE of above estimates
#>              pool12 pool34  psi cap.prob exp factor Pop Est
#> 1              13.2   10.5 64.2    0.001        2.5   10831
#> est unmarked     NA     NA  0.0       NA         NA   10831
#> 
#> 
#> Chisquare gof cutoff  : 0.1 
#> Chisquare gof value   : 34.97117 
#> Chisquare gof df      : 1 
#> Chisquare gof p       : 3.34624e-09

4.4 Complete physical pooling (Pooled Petersen Estimator)

# do physical complete pooling
mod..5 <- SPAS.fit.model(test.data,
                           model.id="Physical complete pooling",
                           row.pool.in=c(1,1), col.pool.in=c(1,1,1,1))
#> Using nlminb to find conditional MLE
#> outer mgc:  31868.37 
#> outer mgc:  29339.97 
#> outer mgc:  18211.34 
#> outer mgc:  9394.572 
#> outer mgc:  3418.652 
#> outer mgc:  1125.712 
#> outer mgc:  305.367 
#> outer mgc:  48.36082 
#> outer mgc:  1.890408 
#> outer mgc:  0.003210701 
#> outer mgc:  9.310952e-09 
#> Convergence codes from nlminb  0 relative convergence (4) 
#> Finding conditional estimate of N
SPAS.print.model(mod..5)
#> Model Name: Physical complete pooling 
#>    Date of Fit: 2024-01-25 12:19 
#>    Version of OPEN SPAS used : SPAS-R 2023-03-31 
#>  
#> Raw data 
#>        V1   V2   V3   V4   V5
#> [1,]  160  127   72   82 3592
#> [2,]   24   66   13   10  532
#> [3,] 7960 9720 6264 7934    0
#> 
#> Row pooling setup : 1 1 
#> Col pooling setup : 1 1 1 1 
#> Physical pooling  : TRUE 
#> Theta pooling     : FALSE 
#> CJS pooling       : FALSE 
#> 
#> 
#> Chapman estimator of population size  273430  (SE  10793  )
#>  
#> 
#> Raw data AFTER PHYSICAL (but not logical) POOLING 
#>       1   V5
#> 1   554 4124
#>   31878    0
#> 
#> Condition number of XX' where X= (physically) pooled matrix is  1 
#> Condition number of XX' after logical pooling                   1 
#> 
#> Large value of kappa (>1000) indicate that rows are approximately proportional which is not good
#> 
#>   Conditional   Log-Likelihood: 331838.7    ;  np: 3 ;  AICc: -663671.3 
#> 
#>   Code/Message from optimization is:  0 relative convergence (4) 
#> 
#> Estimates
#>                  1  psi cap.prob exp factor Pop Est
#> 1              554 4124    0.017       57.5  273857
#> est unmarked 31878    0       NA         NA  273857
#> 
#> SE of above estimates
#>                 1  psi cap.prob exp factor Pop Est
#> 1            23.5 64.2    0.001        2.5   10831
#> est unmarked   NA  0.0       NA         NA   10831
#> 
#> 
#> Chisquare gof cutoff  : 0.1 
#> Chisquare gof value   : 1.562464e-19 
#> Chisquare gof df      : 0 
#> Chisquare gof p       : NA
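As a cross-check on the point estimate, the simple Petersen formula can be applied directly to the completely pooled counts. This is a minimal sketch in base R with illustrative variable names; it reproduces only the population estimate, not the standard error, which SPAS obtains from its own conditional likelihood fit.

```r
x <- matrix(c( 160,  127,   72,   82, 3592,
                24,   66,   13,   10,  532,
              7960, 9720, 6264, 7934,    0),
            nrow = 3, byrow = TRUE)

m2 <- sum(x[1:2, 1:4])        # marked fish recovered (all strata pooled)
n1 <- sum(x[1:2, ])           # marked fish released (recovered + never recovered)
n2 <- m2 + sum(x[3, 1:4])     # all fish examined at recovery

N.petersen <- n1 * n2 / m2    # simple Petersen estimator
round(N.petersen)
```

The rounded value agrees with the Pop Est of 273857 reported for the completely pooled model.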

4.5 Comparing the estimates of abundance

#>       .id       date                                       model.id s.a.pool
#> 1  mod..3 2024-01-25                 Physical pooling to single row        1
#> 2 mod..3a 2024-01-25                  Logical pooling to single row        2
#> 3  mod..4 2024-01-25 Physical pooling all rows and last two columns        1
#> 4  mod..5 2024-01-25                      Physical complete pooling        1
#>   t.p.pool logL.cond np      AICc gof.chisq gof.df gof.p   Nhat Nhat.se
#> 1        4  287272.7  6 -574533.3      38.4      3     0 273857   10831
#> 2        4  285423.8 11 -570825.7      38.4      3     0 273857   10831
#> 3        2  309568.0  4 -619127.9      35.0      1     0 273857   10831
#> 4        1  331838.7  3 -663671.3       0.0      0    NA 273857   10831

Notice that the estimates of the population size (273857, SE 10831) are identical under logical or physical row pooling, and do not change with the column pooling.

5 References

Darroch, J. N. (1961). The two-sample capture-recapture census when tagging and sampling are stratified. Biometrika, 48, 241–260. https://www.jstor.org/stable/2332748

Plante, N., Rivest, L.-P., & Tremblay, G. (1998). Stratified capture-recapture estimation of the size of a closed population. Biometrics, 54, 47–60. https://www.jstor.org/stable/2533994

Schwarz, C. J., & Taylor, C. G. (1998). The use of the stratified-Petersen estimator in fisheries management with an illustration of estimating the number of pink salmon (Oncorhynchus gorbuscha) that return to spawn in the Fraser River. Canadian Journal of Fisheries and Aquatic Sciences, 55, 281–296. https://doi.org/10.1139/f97-238