`matsbyname`

Matrices are important mathematical objects, and they often describe networks of flows among nodes. Example networks are given in the following table.

System type | Flows | Nodes |
---|---|---|
Ecological | nutrients | organisms |
Manufacturing | materials | factories |
Economic | money | economic sectors |

The power of matrices lies in their ability to organize network-wide calculations, thereby simplifying the work of analysts who study entire systems. However, three problems arise when performing matrix operations in `R` and other languages.

Although built-in matrix functions ensure size conformity of matrix operands, they do not respect the names of rows and columns (known as `dimnames` in `R`). In the following example, **U** represents a *use* matrix that contains the quantity of each product used by each industry, and **Y** represents a *final demand* matrix that contains the quantity of each product consumed by final demand industries. If the rows and columns of the two matrices are not in the same order, their sum is nonsensical.

```
productnames <- c("p1", "p2")
industrynames <- c("i1", "i2")
U <- matrix(1:4, ncol = 2, dimnames = list(productnames, industrynames))
U
#> i1 i2
#> p1 1 3
#> p2 2 4
Y <- matrix(1:4, ncol = 2, dimnames = list(rev(productnames), rev(industrynames)))
Y
#> i2 i1
#> p2 1 3
#> p1 2 4
# This sum is nonsensical. Neither row nor column names are respected.
U + Y
#> i1 i2
#> p1 2 6
#> p2 4 8
```

As a result, analysts performing matrix operations must maintain strict order of rows and columns across all calculations.
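One way to meet this requirement by hand in base `R` is to index one operand by the other's row and column names before adding. The following is a minimal sketch, not a general solution: the names `Y_reordered` and `name_respecting_sum` are introduced here for illustration only, and the approach assumes both matrices contain exactly the same set of row and column names.

```r
productnames <- c("p1", "p2")
industrynames <- c("i1", "i2")
U <- matrix(1:4, ncol = 2, dimnames = list(productnames, industrynames))
Y <- matrix(1:4, ncol = 2, dimnames = list(rev(productnames), rev(industrynames)))

# Re-order the rows and columns of Y by name so they line up with U.
Y_reordered <- Y[rownames(U), colnames(U)]

# Each element of U is now added to its namesake in Y.
# Every entry of this particular sum equals 5.
name_respecting_sum <- U + Y_reordered
name_respecting_sum
```

The indexing step has to be repeated, and kept correct, for every operand in every calculation.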

In many cases, operand matrices may have different numbers or different names of rows or columns. This situation can occur when, for example, products or industries change across time periods. When performing matrix operations, rows or columns of zeros must be added to ensure name conformity.

```
Y3 <- matrix(5:8, ncol = 2, dimnames = list(c("p1", "p3"), c("i1", "i3")))
Y3
#> i1 i3
#> p1 5 7
#> p3 6 8
# Nonsensical because neither row nor column names are respected.
# The "p3" rows and "i3" columns of Y3 have been added to
# "p2" rows and "i2" columns of U.
# Row and column names for the sum are taken from the first operand (U).
U + Y3
#> i1 i2
#> p1 6 10
#> p2 8 12
# Rather, need to insert missing rows and columns in both U and Y3 before summing.
U_2000 <- matrix(c(1, 3, 0,
                   2, 4, 0,
                   0, 0, 0),
                 ncol = 3, byrow = TRUE,
                 dimnames = list(c("p1", "p2", "p3"), c("i1", "i2", "i3")))
Y_2000 <- matrix(c(5, 0, 7,
                   0, 0, 0,
                   6, 0, 8),
                 ncol = 3, byrow = TRUE,
                 dimnames = list(c("p1", "p2", "p3"), c("i1", "i2", "i3")))
U_2000
#> i1 i2 i3
#> p1 1 3 0
#> p2 2 4 0
#> p3 0 0 0
Y_2000
#> i1 i2 i3
#> p1 5 0 7
#> p2 0 0 0
#> p3 6 0 8
U_2000 + Y_2000
#> i1 i2 i3
#> p1 6 3 7
#> p2 2 4 0
#> p3 6 0 8
```
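Written out as a reusable step, the manual padding above amounts to something like the following base `R` sketch. The helper name `pad_to_names` is hypothetical and is not part of any package. It assumes that every row and column name of the input appears in the target name vectors.

```r
# Hypothetical helper: embed m in a zero matrix with the requested names.
pad_to_names <- function(m, rnames, cnames) {
  out <- matrix(0, nrow = length(rnames), ncol = length(cnames),
                dimnames = list(rnames, cnames))
  # Copy the existing entries into their named positions.
  out[rownames(m), colnames(m)] <- m
  out
}

U <- matrix(1:4, ncol = 2, dimnames = list(c("p1", "p2"), c("i1", "i2")))
Y3 <- matrix(5:8, ncol = 2, dimnames = list(c("p1", "p3"), c("i1", "i3")))

# The union of names defines the common set of rows and columns.
all_rows <- sort(union(rownames(U), rownames(Y3)))
all_cols <- sort(union(colnames(U), colnames(Y3)))

padded_sum <- pad_to_names(U, all_rows, all_cols) +
  pad_to_names(Y3, all_rows, all_cols)
padded_sum
```

Even with such a helper, the analyst must remember to call it on every operand before every operation.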

The analyst’s burden is cumbersome. But worse problems await.

Respecting names (and adding rows and columns of zeroes) can lead to an inability to invert matrices downstream, as shown in the following example.

```
# The original U matrix is invertible.
solve(U)
#> p1 p2
#> i1 -2 1.5
#> i2 1 -0.5
# The version of U that contains zero rows and columns (U_2000)
# is singular and cannot be inverted.
tryCatch(solve(U_2000), error = function(err){print(err)})
#> <simpleError in solve.default(U_2000): Lapack routine dgesv: system is exactly singular: U[3,3] = 0>
```
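A manual workaround in base `R` is to drop the all-zero rows and columns again before inverting. The sketch below is an illustration only. The names `keep_rows`, `keep_cols`, `U_trimmed`, and `U_inverse` are introduced here, and the sketch assumes that removing zero rows and columns leaves a square, non-singular matrix.

```r
U_2000 <- matrix(c(1, 3, 0,
                   2, 4, 0,
                   0, 0, 0),
                 ncol = 3, byrow = TRUE,
                 dimnames = list(c("p1", "p2", "p3"), c("i1", "i2", "i3")))

# Identify rows and columns that contain at least one non-zero entry.
keep_rows <- rowSums(U_2000 != 0) > 0
keep_cols <- colSums(U_2000 != 0) > 0

# drop = FALSE keeps the result a matrix even if only one row or column survives.
U_trimmed <- U_2000[keep_rows, keep_cols, drop = FALSE]

# The trimmed matrix is the original 2 x 2 U, so it can be inverted.
U_inverse <- solve(U_trimmed)
U_inverse
```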

Matrix functions provided by `R` and other languages do not ensure type conformity for the operands of matrix algebra functions. In the example of matrix multiplication, columns of the multiplicand must contain the same type of information as the rows of the multiplier. If the columns of **A** are countries, then the rows of **B** must also be countries (and in the same order) if **A** `%*%` **B** is to make sense.
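The ordering half of this requirement can be seen in a small base `R` sketch. The country labels `US` and `GB` and the column labels `x` and `y` are hypothetical. `%*%` multiplies by position and raises no warning when the labels are misaligned, so the operands must be aligned by hand.

```r
# Columns of A are countries, in the order US, GB.
A <- matrix(1:4, ncol = 2, dimnames = list(c("p1", "p2"), c("US", "GB")))
# Rows of B are also countries, but in the order GB, US.
B <- matrix(1:4, ncol = 2, dimnames = list(c("GB", "US"), c("x", "y")))

# Positional product: US columns of A are multiplied by GB rows of B.
naive_product <- A %*% B

# Re-order the rows of B by the column names of A before multiplying.
aligned_product <- A %*% B[colnames(A), , drop = FALSE]

naive_product["p1", "x"]    # 1*1 + 3*2 = 7, mixes US with GB
aligned_product["p1", "x"]  # 1*2 + 3*1 = 5, pairs US with US and GB with GB
```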

The `matsbyname` package automatically addresses all three problems above. It performs smart matrix operations that

- respect row and column names
  - by inserting rows and columns of zeroes as necessary and
  - by re-ordering rows and columns to ensure conformity of the names of operand rows and columns, and
- respect row and column types, enforcing conformity as appropriate.

These features are available without analyst intervention, as shown in the following example.

```
# Same as U + Y after re-ordering Y to match U, but no manual re-ordering is needed.
sum_byname(U, Y)
#> i1 i2
#> p1 5 5
#> p2 5 5
# Same as U_2000 + Y_2000, but U and Y3 are unmodified.
sum_byname(U, Y3)
#> i1 i2 i3
#> p1 6 3 7
#> p2 2 4 0
#> p3 6 0 8
# Eliminate zero-filled rows and columns. Same result as solve(U).
U_2000 %>% clean_byname(margin = c(1,2), clean_value = 0) %>% solve()
#> p1 p2
#> i1 -2 1.5
#> i2 1 -0.5
```

Beyond `sum_byname` and `clean_byname`, the `matsbyname` package contains many other matrix algebra functions that respect the names of rows and columns. Commonly used functions are:

- `sum_byname()`
- `difference_byname()`
- `hadamardproduct_byname()`
- `matrixproduct_byname()`
- `quotient_byname()`
- `rowsums_byname()`
- `colsums_byname()`
- `invert_byname()`
- `transpose_byname()`

The full list of functions can be found by running `?matsbyname` and clicking the `Index` link.

Furthermore, `matsbyname` works well with its sister package, `matsindf`. `matsindf` creates data frames whose entries are not numbers but entire matrices, thereby enabling the use of `matsbyname` functions in `tidyverse` functional programming.

When used together, `matsbyname` and `matsindf` allow analysts to wield simultaneously the power of both matrix mathematics and `tidyverse` functional programming.
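The idea of a data frame whose entries are matrices can be illustrated in base `R` with list columns. The sketch below does not use the `matsindf` API. The column names `year`, `U`, `Y`, and `total` and the numbers themselves are hypothetical, and plain `+` stands in for a `*_byname` function.

```r
U <- matrix(1:4, ncol = 2, dimnames = list(c("p1", "p2"), c("i1", "i2")))
Y <- matrix(5:8, ncol = 2, dimnames = list(c("p1", "p2"), c("i1", "i2")))

# One row per year; each cell of the U and Y columns holds a whole matrix.
df <- data.frame(year = c(2000, 2001))
df$U <- list(U, 2 * U)
df$Y <- list(Y, 2 * Y)

# Apply a matrix operation row by row, storing the resulting matrices
# in a new list column.
df$total <- Map(`+`, df$U, df$Y)
df$total[[2]]
```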

This vignette demonstrates the power of matrix mathematics performed `by name`.

The `matsbyname` package has several features that both simplify analyses and ensure their correctness.

In the preceding examples, row and column names were provided by the `dimnames` argument to the `matrix` function. However, `matsbyname` provides the `setcolnames_byname` and `setrownames_byname` functions to perform the same tasks using the pipe operator (`%>%`).

Row and column types can be understood by analogy: row and column types are to matrices in matrix algebra as units are to scalars in scalar algebra. Just as careful tracking of units is necessary in scalar calculations, careful tracking of row and column types is necessary in matrix operations. Because `matsbyname` keeps track of row and column types automatically, much of the burden of dealing with row and column types is removed from the analyst.

Row and column types are character strings stored as attributes of the matrix object, and `matsbyname` functions ensure correctness of matrix operations by checking row and column types, throwing errors if needed. Row and column types can be set by the functions `setrowtype` and `setcoltype` and retrieved by the functions `rowtype` and `coltype`. Consider matrices **A**, **B**, and **C**:

```
A <- matrix(1:4, ncol = 2) %>%
  setrownames_byname(productnames) %>% setcolnames_byname(industrynames) %>%
  setrowtype("Products") %>% setcoltype("Industries")
A
#> i1 i2
#> p1 1 3
#> p2 2 4
#> attr(,"rowtype")
#> [1] "Products"
#> attr(,"coltype")
#> [1] "Industries"
B <- matrix(8:5, ncol = 2) %>%
  setrownames_byname(productnames) %>% setcolnames_byname(industrynames) %>%
  setrowtype("Products") %>% setcoltype("Industries")
B
#> i1 i2
#> p1 8 6
#> p2 7 5
#> attr(,"rowtype")
#> [1] "Products"
#> attr(,"coltype")
#> [1] "Industries"
C <- matrix(1:4, ncol = 2) %>%
  setrownames_byname(industrynames) %>% setcolnames_byname(productnames) %>%
  setrowtype("Industries") %>% setcoltype("Products")
C
#> p1 p2
#> i1 1 3
#> i2 2 4
#> attr(,"rowtype")
#> [1] "Industries"
#> attr(,"coltype")
#> [1] "Products"
```

**B** can be added to **A**, because row and column types are identical.

```
sum_byname(A, B)
#> i1 i2
#> p1 9 9
#> p2 9 9
#> attr(,"rowtype")
#> [1] "Products"
#> attr(,"coltype")
#> [1] "Industries"
```

However, **C** cannot be added to **A** (or **B**), because row and column types disagree.

```
tryCatch(sum_byname(A, C), error = function(err){print(err)})
#> <simpleError in organize_args(a, b, fill = 0, match_type = match_type): rowtype(a) (Products) != rowtype(b) (Industries).>
```

In this case, a sum is possible if **C** is transposed prior to adding to **A**, because row and column types of **A** and **C**^{T} agree.

```
sum_byname(A, transpose_byname(C))
#> i1 i2
#> p1 2 5
#> p2 5 8
#> attr(,"rowtype")
#> [1] "Products"
#> attr(,"coltype")
#> [1] "Industries"
```
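The kind of check that governs sums can be sketched in base `R` with plain attributes. This is an illustration of the idea only, not the package's implementation. The helpers `set_types`, `typed_transpose`, and `typed_sum` are hypothetical names, and the sketch assumes the operands already share row and column order.

```r
# Hypothetical helper: store row and column types as attributes.
set_types <- function(m, rowtype, coltype) {
  attr(m, "rowtype") <- rowtype
  attr(m, "coltype") <- coltype
  m
}

# Transposing swaps the row and column types along with the data.
typed_transpose <- function(m) {
  set_types(t(m), attr(m, "coltype"), attr(m, "rowtype"))
}

# Adding is allowed only when both row types and column types agree.
typed_sum <- function(a, b) {
  if (!identical(attr(a, "rowtype"), attr(b, "rowtype")) ||
      !identical(attr(a, "coltype"), attr(b, "coltype"))) {
    stop("row and column types of the operands disagree")
  }
  a + b
}

A <- set_types(matrix(1:4, ncol = 2, dimnames = list(c("p1", "p2"), c("i1", "i2"))),
               "Products", "Industries")
C <- set_types(matrix(1:4, ncol = 2, dimnames = list(c("i1", "i2"), c("p1", "p2"))),
               "Industries", "Products")

# Adding C directly fails; the error message is captured as a string.
direct_attempt <- tryCatch(typed_sum(A, C), error = function(err) conditionMessage(err))
# Adding the transpose of C succeeds because the types now agree.
transposed_sum <- typed_sum(A, typed_transpose(C))
transposed_sum
```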

Matrices **A** and **B** can be element-multiplied and element-divided for the same reason they can be summed: row and column types agree.

```
hadamardproduct_byname(A, B)
#> i1 i2
#> p1 8 18
#> p2 14 20
#> attr(,"rowtype")
#> [1] "Products"
#> attr(,"coltype")
#> [1] "Industries"
quotient_byname(A, B)
#> i1 i2
#> p1 0.1250000 0.5
#> p2 0.2857143 0.8
#> attr(,"rowtype")
#> [1] "Products"
#> attr(,"coltype")
#> [1] "Industries"
```

Note that **A** and **C** can be matrix-multiplied, because the column type of **A** and the row type of **C** are identical (`Industries`). The result is a `Products`-by-`Products` matrix.

```
matrixproduct_byname(A, C)
#> p1 p2
#> p1 7 15
#> p2 10 22
#> attr(,"rowtype")
#> [1] "Products"
#> attr(,"coltype")
#> [1] "Products"
```

However, **A** and **B** cannot be matrix-multiplied, because the column type of **A** (`Industries`) and the row type of **B** (`Products`) are different.

```
tryCatch(matrixproduct_byname(A, B), error = function(err){print(err)})
#> <simpleError in organize_args(a, b, fill = 0, match_type = match_type): coltype(a) != rowtype(b): Industries != Products.>
```
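The corresponding rule for matrix products can be sketched the same way in base `R`. Again, this illustrates the idea and is not the package's implementation. The helpers `set_types` and `typed_product` are hypothetical names, and the sketch multiplies by position without aligning names.

```r
# Hypothetical helper: store row and column types as attributes.
set_types <- function(m, rowtype, coltype) {
  attr(m, "rowtype") <- rowtype
  attr(m, "coltype") <- coltype
  m
}

# A product is allowed only when the column type of a matches the row type of b.
# The result takes its row type from a and its column type from b.
typed_product <- function(a, b) {
  if (!identical(attr(a, "coltype"), attr(b, "rowtype"))) {
    stop("coltype(a) and rowtype(b) disagree")
  }
  set_types(a %*% b, attr(a, "rowtype"), attr(b, "coltype"))
}

A <- set_types(matrix(1:4, ncol = 2, dimnames = list(c("p1", "p2"), c("i1", "i2"))),
               "Products", "Industries")
B <- set_types(matrix(8:5, ncol = 2, dimnames = list(c("p1", "p2"), c("i1", "i2"))),
               "Products", "Industries")
C <- set_types(matrix(1:4, ncol = 2, dimnames = list(c("i1", "i2"), c("p1", "p2"))),
               "Industries", "Products")

# Industries meet Industries, so the product is a Products-by-Products matrix.
AC <- typed_product(A, C)
# Industries meet Products, so the product is refused.
AB_attempt <- tryCatch(typed_product(A, B), error = function(err) conditionMessage(err))
AC
```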

Analysts are encouraged to set row and column types on matrices, thereby taking advantage of `matsbyname`’s type-tracking feature to improve their matrix-based analyses.

Another feature of the `matsbyname` package is that it works when arguments to functions are lists of matrices, returning lists as appropriate.

```
sum_byname(A, list(B, B))
#> [[1]]
#> i1 i2
#> p1 9 9
#> p2 9 9
#> attr(,"rowtype")
#> [1] "Products"
#> attr(,"coltype")
#> [1] "Industries"
#>
#> [[2]]
#> i1 i2
#> p1 9 9
#> p2 9 9
#> attr(,"rowtype")
#> [1] "Products"
#> attr(,"coltype")
#> [1] "Industries"
hadamardproduct_byname(list(A, A), B)
#> [[1]]
#> i1 i2
#> p1 8 18
#> p2 14 20
#> attr(,"rowtype")
#> [1] "Products"
#> attr(,"coltype")
#> [1] "Industries"
#>
#> [[2]]
#> i1 i2
#> p1 8 18
#> p2 14 20
#> attr(,"rowtype")
#> [1] "Products"
#> attr(,"coltype")
#> [1] "Industries"
matrixproduct_byname(list(A, A), list(C, C))
#> [[1]]
#> p1 p2
#> p1 7 15
#> p2 10 22
#> attr(,"rowtype")
#> [1] "Products"
#> attr(,"coltype")
#> [1] "Products"
#>
#> [[2]]
#> p1 p2
#> p1 7 15
#> p2 10 22
#> attr(,"rowtype")
#> [1] "Products"
#> attr(,"coltype")
#> [1] "Products"
```
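The list behavior resembles what `Map()` provides in base `R`, once a single matrix is wrapped in a list so that it can be recycled. The following sketch shows that pattern under stated assumptions. The helpers `as_list` and `sum_over_lists` are hypothetical names, plain `+` stands in for a name-respecting sum, and nothing here is taken from the package's internals.

```r
# Hypothetical helper: wrap a single matrix in a list; leave lists alone.
as_list <- function(x) if (is.list(x)) x else list(x)

# Apply the sum pairwise. Map() recycles the shorter list as needed.
sum_over_lists <- function(a, b) {
  Map(function(x, y) x + y, as_list(a), as_list(b))
}

A <- matrix(1:4, ncol = 2, dimnames = list(c("p1", "p2"), c("i1", "i2")))
B <- matrix(8:5, ncol = 2, dimnames = list(c("p1", "p2"), c("i1", "i2")))

# A single matrix paired with a list of two matrices yields a list of two sums.
list_result <- sum_over_lists(A, list(B, B))
length(list_result)
list_result[[1]]
```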

The `matsbyname` package simplifies analyses in which row and column names ought to be respected. It provides optional row and column types, thereby ensuring that only valid matrix operations are performed. Finally, `matsbyname` functions work equally well with lists to allow use of `*_byname` functions with `tidyr` and `dplyr` approaches to manipulating data.