Basic_Operations

Philip I. Pavlik Jr.

2023-12-11

LKT (Logistic Knowledge Tracing) Framework

To use LKT, one needs to have:

  • Terms of the model, each including a component level that can be characterized by a feature that describes change across repetitions.
  • A sequence of learner event data with user id and correctness columns (at barest minimum).

Component Level

A component specifies the subsets of the data to which a feature applies. The most basic component is the subject (the student); others include items and knowledge components (KCs). In the model, the effects of each feature for each component are summed, so multiple features combine additively. Except for constant intercepts, a feature is assumed to be computed separately for each level of its component within each subject's data.
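The additive combination described above can be illustrated with a minimal sketch. It is written in Python purely for illustration (LKT itself is an R package), and it assumes the summed effects are in logit units and pass through a logistic link, as the name Logistic Knowledge Tracing suggests. The function name and the effect values are hypothetical.

```python
import math

def predict_correct(feature_effects):
    """Sum additive effects (assumed to be logits) and map to a probability."""
    logit = sum(feature_effects)
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical effects for one trial: a student intercept, a KC intercept,
# and a practice effect for that KC. The values are made up for illustration.
effects = {"student_intercept": 0.4, "kc_intercept": -1.0, "kc_practice": 0.6}
p = predict_correct(effects.values())  # effects sum to about 0, so p is about 0.5
```

Each entry stands for one feature applied to one component; adding a term to the model adds one more entry to the sum.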

Features

Features are the functions that compute the effect of a component's history for each student (the exception is the fixed feature, the constant intercept). Some features have a single term, such as exponential decay (expdecafm, described in 2.C.6), which is a transform of the sequence of prior trials using a decay parameter. Other features are inherently interactive, such as base2, which scales the logarithmic effect of practice by multiplying it by a memory decay term. Still others, like base4 and ppe, involve the interaction of at least 3 inputs. Table 1 summarizes the 25 features currently supported in LearnSphere, indicating whether each feature is linear, adaptive, and/or dynamic and explaining the input data required from the knowledge components.
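As a rough illustration of a single-term feature, the sketch below computes an exponentially decayed count of prior trials for one student-by-component sequence. It is a Python illustration, not LKT's R implementation of expdecafm. The function name and the convention that the most recent prior trial has age 0 are assumptions.

```python
def decayed_prior_count(n_trials, decay):
    """For each trial, sum decay**age over the earlier trials of the sequence.

    Assumed convention: the most recent prior trial has age 0.
    """
    values = []
    running = 0.0
    for _ in range(n_trials):
        values.append(running)            # the feature sees only earlier trials
        running = running * decay + 1.0   # age all earlier trials, add this one
    return values

feature = decayed_prior_count(4, 0.5)  # [0.0, 1.0, 1.5, 1.75]
```

With a decay of 1 the feature reduces to a plain count of prior trials, and smaller decay values weight recent trials more heavily.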

Feature Types

A standard feature (every feature except intercept, which is always “extended”) is fit with one coefficient shared by all levels of the component factor. A feature can instead be extended with the $ operator, which causes LKT to fit a separate coefficient for each level of the component factor. The most straightforward example of this extension is for KCs, where models have typically used a different coefficient for each knowledge component. In AFM, for instance, each KC gets its own coefficient to characterize how fast it is learned across opportunities; in LKT notation this is specified with the $ operator. If the $ operator is not present, a single coefficient is fit for the feature.
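The difference between a standard and an extended feature can be sketched as follows. This Python fragment only illustrates the idea of one shared coefficient versus one coefficient per level. The function name, KC labels, and coefficient values are hypothetical and do not reflect how LKT stores fitted coefficients.

```python
def practice_effect(kc, prior_count, coef):
    """Effect of a linear practice feature for one trial.

    coef is either a single number (standard feature) or a dict that maps
    each KC level to its own number (extended feature, the $ case).
    """
    slope = coef[kc] if isinstance(coef, dict) else coef
    return slope * prior_count

shared = 0.2                         # hypothetical single coefficient
per_kc = {"kc_a": 0.1, "kc_b": 0.3}  # hypothetical per-KC coefficients

standard_effect = practice_effect("kc_b", 3, shared)  # same slope for every KC
extended_effect = practice_effect("kc_b", 3, per_kc)  # slope specific to kc_b
```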

Intercept features can also be modified with the @ operator, which produces random intercepts instead of the default fixed intercepts. This option was included primarily to support the initial comparison of options for modeling individual differences among students in the LKT paper, where we wished to compare fixed-effect and random-effect intercepts with other methods for identifying subject variability, like propdec, propdec2, and logitdec.

Learner Data Requirements

LKT expects data in the DataShop format, although the models need only some of its columns. See the example data below for the minimal format. Rows are assumed to be in consecutive order and grouped by user id.
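The grouping assumption can be checked before fitting. The sketch below is a hypothetical Python helper, not part of LKT, that tests whether each user's rows form one contiguous run. It does not check that the rows within a user are in chronological order.

```python
def is_grouped_by_user(user_ids):
    """Return True if each user's rows form a single contiguous run."""
    seen = set()
    previous = object()  # sentinel that equals no real id
    for uid in user_ids:
        if uid != previous:
            if uid in seen:
                return False  # this user appeared earlier, then was interrupted
            seen.add(uid)
            previous = uid
    return True

grouped = is_grouped_by_user(["s1", "s1", "s2", "s2"])      # True
interleaved = is_grouped_by_user(["s1", "s2", "s1", "s2"])  # False
```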

Column Requirements for computeSpacingPredictors and Dependencies

Main Function: computeSpacingPredictors

Required Columns

  • Anon.Student.Id
  • CF..ansbin.

Optionally Generated Columns

  • CF..reltime. (Generated by practiceTime if not present)
  • CF..Time. (Generated if not present)

Dependencies

practiceTime

  • Requires: CF..ansbin., Anon.Student.Id, Duration..sec.

componentspacing

  • Requires: index (Generated in the main function), CF..Time. or CF..reltime.

componentprev

  • Requires: index (Generated in the main function), CF..ansbin.

meanspacingf

  • Requires: index (Generated in the main function), ${i}spacing

laggedspacingf

  • Requires: index (Generated in the main function), ${i}spacing

Summary

To use computeSpacingPredictors and its dependencies, the input dataset must contain at least the Anon.Student.Id and CF..ansbin. columns. The CF..reltime. and CF..Time. columns are used if present and generated otherwise. The Duration..sec. column is required specifically by the practiceTime function, that is, when CF..reltime. has to be generated.
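The column dependencies above can be made concrete with a small sketch. It is a Python illustration, not the LKT R code. It assumes that relative time is the cumulative duration of a student's earlier trials, that spacing is the time since the same student last practiced the same KC, and that spacing is 0 on a first encounter. The (student, KC) key plays the role the source gives to the generated index. The sample values are the first six rows of the example data shown later in this document.

```python
def relative_times(student_ids, durations):
    """Cumulative seconds of each student's earlier trials (assumed definition)."""
    elapsed = {}
    out = []
    for sid, dur in zip(student_ids, durations):
        out.append(elapsed.get(sid, 0))
        elapsed[sid] = elapsed.get(sid, 0) + dur
    return out

def component_spacing(student_ids, kcs, times):
    """Time since the same student last saw the same KC (0 on first encounter)."""
    last_seen = {}
    out = []
    for sid, kc, t in zip(student_ids, kcs, times):
        key = (sid, kc)  # stands in for the student-by-component index
        out.append(t - last_seen[key] if key in last_seen else 0)
        last_seen[key] = t
    return out

students = ["Stu_0391448da5eac00f9b6dd455081aa08e"] * 6
durations = [54, 12, 23, 16, 16, 25]
kcs = ["1-3 A norm", "13-3 The u", "14-2 The v",
       "0-3 A dist", "7-3 The me", "1-3 A norm"]

reltime = relative_times(students, durations)        # [0, 54, 66, 89, 105, 121]
spacing = component_spacing(students, kcs, reltime)  # [0, 0, 0, 0, 0, 121]
```

Only the sixth row repeats a KC, so it is the only row with a nonzero spacing.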

Note on ${i}

The placeholder ${i} denotes that multiple columns could be involved, depending on the KCs specified. For example, ${i}spacing could translate to columns like Math.spacing, Science.spacing, etc., based on what is passed in the KCs parameter.

Example Data

The data set for examples is shown below:

Anon.Student.Id                       Duration..sec.  Outcome    KC..Default.
Stu_0391448da5eac00f9b6dd455081aa08e  54              CORRECT    1-3 A norm
Stu_0391448da5eac00f9b6dd455081aa08e  12              INCORRECT  13-3 The u
Stu_0391448da5eac00f9b6dd455081aa08e  23              INCORRECT  14-2 The v
Stu_0391448da5eac00f9b6dd455081aa08e  16              CORRECT    0-3 A dist
Stu_0391448da5eac00f9b6dd455081aa08e  16              INCORRECT  7-3 The me
Stu_0391448da5eac00f9b6dd455081aa08e  25              CORRECT    1-3 A norm
Stu_0391448da5eac00f9b6dd455081aa08e  10              INCORRECT  8-3 The no
Stu_0391448da5eac00f9b6dd455081aa08e  24              INCORRECT  13-3 The u
Stu_0391448da5eac00f9b6dd455081aa08e  23              CORRECT    17-3 When
Stu_0391448da5eac00f9b6dd455081aa08e  9               INCORRECT  4-3 Standa
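The example data has an Outcome column but no CF..ansbin. column, which the functions above require. One plausible way to derive it is sketched below in Python for illustration. The mapping of CORRECT to 1 and INCORRECT to 0 is an assumption, and the function name is hypothetical.

```python
def outcome_to_ansbin(outcome):
    """Map an Outcome label to 1 (correct) or 0 (incorrect); assumed coding."""
    label = outcome.strip().lower()
    if label == "correct":
        return 1
    if label == "incorrect":
        return 0
    raise ValueError(f"unrecognized Outcome value: {outcome!r}")

# Outcome column of the ten example rows, in order.
outcomes = ["CORRECT", "INCORRECT", "INCORRECT", "CORRECT", "INCORRECT",
            "CORRECT", "INCORRECT", "INCORRECT", "CORRECT", "INCORRECT"]
ansbin = [outcome_to_ansbin(o) for o in outcomes]  # [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
```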

The LKT paper is under review; please see Pavlik, Eglington, and Harrell-Williams (2021) <arXiv:2005.00869>.