plot.fft() function

Nathaniel Phillips

2016-07-12

This function visualizes fast-and-frugal trees (FFTs). The best way to use it is to create an fft object with fft(), then apply plot() to that object.

Creating an fft object

Let’s start with an example. We’ll create an fft object called heart.fft from the heartdisease dataset:

set.seed(100) # For reproducibility, because the training / testing data split is random

heart.fft <- fft(
  train.cue.df = heartdisease[,names(heartdisease) != "diagnosis"],
  train.criterion.v = heartdisease$diagnosis,
  train.p = .5,
  max.levels = 4
  )
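To see why the seed matters, here is a minimal base R sketch of a reproducible random split. It is an illustration only and not the internal code of fft(); the helper name split_half and its arguments are hypothetical.

```r
# Illustration only: a reproducible random split in base R.
# NOT the internals of fft(); split_half() is a hypothetical helper.
split_half <- function(n, train.p = 0.5, seed = 100) {
  set.seed(seed)                                   # fix the random number stream
  train.idx <- sample(n, size = round(n * train.p))
  list(train = sort(train.idx),                    # rows used for training
       test  = setdiff(seq_len(n), train.idx))     # remaining rows used for testing
}

a <- split_half(10)
b <- split_half(10)
identical(a, b)   # same seed, so both calls give the same split
```

Without a fixed seed, each run would assign different cases to the training and test sets, so the resulting trees and plots could change from run to run.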

Visualizing individual trees

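Once you’ve created an fft object using fft(), you can visualize the tree (and ROC curves) using plot(). There are two main arguments: which.tree, which selects the tree to plot (a tree number such as 1, or "best.train" for the best training tree), and which.data, which selects the dataset the tree is applied to (such as "test").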

Let’s plot the best training tree for the heartdisease data when applied to the test dataset:

plot(heart.fft,
     which.tree = "best.train",
     which.data = "test",
     description = "Heart Disease",
     decision.names = c("Healthy", "Disease")
     )

Here’s how to interpret this tree: if thal is greater than 3, classify as signal (“+ disease”). For this dataset, 72 cases (18 true noise + 54 true signal) were classified as signal while the remaining 80 cases (152 - 72) moved to the next level. Now, if cp is less than 4, classify as noise (“- disease”). Here, of the remaining 80 cases, 50 (45 true noise and 5 true signal) were classified as noise (“- disease”). This left 30 cases which were classified at the final level.
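The first two levels described above can be sketched as a small base R function, followed by the case counts quoted in the text. The helper classify_two_levels is hypothetical and covers only these two levels; the example cue values are made up for illustration.

```r
# Hypothetical helper mirroring only the two levels described in the text.
# NA marks cases that are passed on to later levels, which are not covered here.
classify_two_levels <- function(thal, cp) {
  ifelse(thal > 3, "signal",
         ifelse(cp < 4, "noise", NA_character_))
}

# Three made-up cases: one exits as signal, one as noise, one moves on
classify_two_levels(thal = c(7, 3, 3), cp = c(4, 2, 4))

# Case counts quoted in the text for the test data
n.test      <- 152
n.signal.l1 <- 18 + 54   # classified as signal at level 1
n.noise.l2  <- 45 + 5    # classified as noise at level 2
n.left      <- n.test - n.signal.l1 - n.noise.l2   # cases reaching the final level
```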

Cumulative classification statistics are shown in the bottom panel of the plot.

Now let’s compare the result to tree # 1. This is a very conservative tree with Noise exits on all but the last branch. This tree should have a very low false-alarm rate but (unfortunately) also a very low hit-rate.

plot(heart.fft,
     which.tree = 1,
     which.data = "test",
     description = "Heart Disease",
     decision.names = c("Healthy", "Disease")
     )

Indeed, this tree is very conservative: only 23 cases were classified as signals (“+ disease”) at the very end of the tree.

ROC curves

You can also plot ROC curves, showing the cumulative hit rate (HR) and false-alarm rate (FAR) of each tree, by specifying roc = TRUE. Because all trees (and both the training and test datasets) are plotted, no additional arguments are necessary.

Here are the training and test ROC curves for the heart.fft object. The best training tree is marked with large filled symbols: the circle is for the training data and the triangle is for the test data.

plot(heart.fft,
     roc = TRUE
     )

Here, the value of the triangle (FAR = 24%, HR = 88%) corresponds to our first individual tree plot above (i.e., the best fitting tree applied to the test data). The point towards the bottom left of the plot (FAR = 0%, HR = 32%) corresponds to tree 1 (the very conservative tree).
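Each point on the ROC plot is a pair of rates: the hit rate (HR) and the false-alarm rate (FAR). As a rough sketch of how such a pair can be computed from true and predicted classes, the hypothetical helper hr_far below uses base R only; the example vectors are made up and are not taken from the heartdisease data.

```r
# Hypothetical helper: hit rate (HR) and false-alarm rate (FAR)
# from logical vectors of true and predicted signal status.
hr_far <- function(truth, prediction) {
  hits               <- sum(truth & prediction)     # signal called signal
  misses             <- sum(truth & !prediction)    # signal called noise
  false.alarms       <- sum(!truth & prediction)    # noise called signal
  correct.rejections <- sum(!truth & !prediction)   # noise called noise
  c(HR  = hits / (hits + misses),
    FAR = false.alarms / (false.alarms + correct.rejections))
}

# Made-up example: 4 true signal cases and 4 true noise cases
truth      <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, TRUE)
prediction <- c(TRUE, TRUE, FALSE, FALSE, TRUE, FALSE, FALSE, TRUE)
hr_far(truth, prediction)
```

A tree that rarely predicts signal keeps false alarms low but also misses many true signal cases, which is why a very conservative tree sits near the bottom left of the ROC plot.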

Including LR and CART

You can also include ROC curves for logistic regression (LR) and CART by adding lr = TRUE and/or cart = TRUE:

plot(heart.fft,
     roc = TRUE,
     lr = TRUE,
     cart = TRUE
     )

Here, we can see that for data fitting (circles and dashed lines), the FFTs did worse than LR and CART. However, for prediction on the test data (triangles and solid lines), the trees outperformed both LR and CART.