Logistic Regression

Material of Tue 26 2019, week 4

It’s worth bearing in mind that logistic regression is essentially linear regression adapted to a categorical output: rather than modelling \(Y\) directly, we model the probability of class membership, \(\Pr(Y = 1 \mid X) = \frac{e^{\beta_0 + \beta_1 X}}{1 + e^{\beta_0 + \beta_1 X}}\).

This is a parametric method because all that remains is to estimate the parameter coefficients, and it is supervised learning because there are observed output values.
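As a quick sketch of the shape of that function (the coefficients here are illustrative only, chosen to roughly match the model fitted later in this section):

beta0 <- -10.65   # illustrative intercept
beta1 <- 0.0055   # illustrative slope
x <- seq(0, 2700, length.out = 200)
p <- exp(beta0 + beta1 * x) / (1 + exp(beta0 + beta1 * x))   # same as plogis(beta0 + beta1 * x)
plot(x, p, type = "l", xlab = "x", ylab = "P(Y = 1 | x)",
     main = "The logistic function maps any linear predictor into (0, 1)")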

Question 01: Logistic Regression: Use “Default” data sets

1. Start R studio. Load the “Default” data set built into R and view it using the Console.

def <- read.csv(file = "Datasets/Default.csv")

head(def)
##   X default student   balance    income
## 1 1      No      No  729.5265 44361.625
## 2 2      No     Yes  817.1804 12106.135
## 3 3      No      No 1073.5492 31767.139
## 4 4      No      No  529.2506 35704.494
## 5 5      No      No  785.6559 38463.496
## 6 6      No     Yes  919.5885  7491.559
    # writeLines("\n")
    # print("***Dimensions***")
    writeLines("\n")
dim(def)
## [1] 10000     5
    # writeLines("\n")
    # print("***Summary***")
    # writeLines("\n")
summary(def)
##        X         default    student       balance           income     
##  Min.   :    1   No :9667   No :7056   Min.   :   0.0   Min.   :  772  
##  1st Qu.: 2501   Yes: 333   Yes:2944   1st Qu.: 481.7   1st Qu.:21340  
##  Median : 5000                         Median : 823.6   Median :34553  
##  Mean   : 5000                         Mean   : 835.4   Mean   :33517  
##  3rd Qu.: 7500                         3rd Qu.:1166.3   3rd Qu.:43808  
##  Max.   :10000                         Max.   :2654.3   Max.   :73554
    # writeLines("\n")
    # print("***Structure***")
    # writeLines("\n")
str(def)
## 'data.frame':    10000 obs. of  5 variables:
##  $ X      : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ default: Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
##  $ student: Factor w/ 2 levels "No","Yes": 1 2 1 1 1 2 1 2 1 1 ...
##  $ balance: num  730 817 1074 529 786 ...
##  $ income : num  44362 12106 31767 35704 38463 ...
    # writeLines("\n")

Dummy Variables

When glm is used on a categorical variable stored as the factor class in R, that variable will be encoded as a dummy variable (i.e. its levels become 0/1 indicator values). I initially thought it might be better to do this encoding yourself, because otherwise you have to figure out which dummy value corresponds to which level; however, at p. 157 of the textbook a different approach is taken: glm is used directly and then contrasts(def$student) is called to work out which level is which. That is probably the better way to do it.

You shouldn’t work with dummy variables by hand at all; that is what contrasts are for. Work with contrasts[1] instead[2] of doing it all manually, and it will prevent you from making a mistake.
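To see exactly what glm builds internally, model.matrix() exposes the dummy coding. A minimal sketch using the def data (the commented output follows from the head(def) shown above):

head(model.matrix(~ student, data = def))
#   (Intercept) studentYes
# 1           1          0
# 2           1          1
# 3           1          0
# 4           1          0
# 5           1          0
# 6           1          1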

If values are already numeric but represent categorical variables, it is necessary to ensure that they are of the factor class, not numeric/integer/double etc., because otherwise the model will treat them as continuous and perform sub-optimally. This can be tested by using class().[3]

If values are integers but need to be considered as discrete values, it is usually better to convert them into factors as well; otherwise models will simply assume they are continuous variables and perform sub-optimally. (It may be worth preserving two columns in the data frame, one factorised and one original: occasionally plots will complain about discrete data, and it is then easier to feed them the numeric version.)

First convert the data from factors to something else, just to see how that is done; the easiest target in this case is character (be mindful that in R there is no string class, only the character class, which is distinct from C-like languages such as Java). Again this can be tested by using class().

There is also typeof(), which returns the underlying type of the object from R’s own perspective, whereas class() reports the type from the perspective of object-oriented programming.
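A minimal sketch of that conversion (converting back to factors happens in the next chunk):

def$default <- as.character(def$default)
def$student <- as.character(def$student)
class(def$default)    # "character" -- there is no "string" class in R
typeof(def$default)   # "character"
# After the conversion back to factor below, class() will report "factor"
# while typeof() reports "integer" -- factors are integer codes underneath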

def$default <- as.factor(def$default)
def$student <- as.factor(def$student)
glm(formula = default ~ balance, data = def, family = "binomial")
## 
## Call:  glm(formula = default ~ balance, family = "binomial", data = def)
## 
## Coefficients:
## (Intercept)      balance  
##  -10.651331     0.005499  
## 
## Degrees of Freedom: 9999 Total (i.e. Null);  9998 Residual
## Null Deviance:       2921 
## Residual Deviance: 1596  AIC: 1600
contrasts(def$default)
##     Yes
## No    0
## Yes   1
contrasts(def$student)
##     Yes
## No    0
## Yes   1

2. Plot Income versus Balance (i.e. Plot the Predictors against Each Other)

Generally it would be more appropriate to map the output variable to the colour, because colour is easier to see than shape and it is the output variable we are most concerned with.

GGplot

By using the structure from above we can decide to map colour and shape to the default and student variables respectively.


predplot <- ggplot(data = def, aes(y = balance, x = income, col = default, shape = student)) +
  geom_point(size = 2, alpha = 0.7) +
  labs(y = TeX("Account Balance $(X_2)$"), x = TeX("Income ($X_3$)"), shape = TeX("Student ( X_1 )"), col = "Default (Y)",
       title = "Account Balace against Income and Defaut") +
  theme_classic()

predplot

#ggplotly(predplot) # This does not support latex!

It’s hard to see whether or not there are any defaults in the centre of the plot, hence also map the alpha level to default, and invert the plot so that the split occurs left/right; that way it will resemble our box plots. (This is done purely for visualising the data: both axes are predictive features, so the orientation doesn’t matter.)

ggplot(data = def, aes(x = balance, y = income, col = default, shape = student)) +
  geom_point(size = 5, aes(alpha = default)) +
  labs(x = TeX("Account Balance $(X_2)$"), y = TeX("Income ($X_3$)"), shape = TeX("Student ( X_1 )"), col = "Default (Y)",
       title = "Account Balace against Income and Defaut") +
  theme_classic()
## Warning: Using alpha for a discrete variable is not advised.

Now it can be clearly seen that defaults occur at a higher frequency when the account balance is higher, and perhaps slightly more often for students than non-students (there are maybe very slightly fewer triangles on the right side than the left).

Defaulting seems to be essentially independent of income.

BoxPlot

It is necessary to observe the behaviour of the output variable in response to all the predictors; however, it is inappropriate to use a scatterplot for this because it comes out almost useless.

A boxplot is the correct way to do this,[4] using the output along the \(x\)-axis and the predictor along the \(y\)-axis; we can create two sets of boxplots in ggplot by mapping x = student and fill = default.

ggplot(data = def, aes(x = student, y = balance, fill = default)) + # predictor on x, response on fill
  geom_boxplot() +
  theme_classic() +
  labs(y = TeX("Balance ($X_2$)"), x = TeX("Student ($X_1$)") , fill = TeX("Default ($Y$)"),
       title = TeX("Default given Status as student and Balance")) +
  scale_fill_brewer(palette = "Pastel1")

From here it can clearly be seen that, although there are some outliers for balance, it is a significant predictor of defaulting; it can also be seen that students are generally somewhat more likely to default, but this observation is not significant.

In my opinion the preceding plot is the more appropriate visualisation.

The textbook chose to visualise the boxplots as below; I think this is less desirable because the effect of income is already clear from the preceding plot, and because student has a more significant impact than income and deserves the attention.

balplot <- ggplot(data = def, aes(x = default, y = balance, fill = default)) +
  geom_boxplot() + 
  labs(x = TeX("Default $(Y)$"), y = TeX("Balance $(X_2)$"), title = "Default given Account Balance") +
  theme_classic2() + 
  guides(fill = FALSE) +
  scale_fill_brewer(palette = "Pastel1")


incplot <- ggplot(data = def, aes(x = default, y = income, fill = default)) +
  geom_boxplot() + 
  labs(x = TeX("Default $(Y)$"), y = TeX("Balance $(X_33$"), title = "Default given Account Balance") +
  theme_classic2() +
  guides(fill = FALSE) +
  scale_fill_brewer(palette = "Pastel1")

grid.arrange(balplot, incplot, nrow = 1)

Base Plot

# plot.formula evaluates pch/col inside `data`, so the columns can be
# referenced directly; income ~ balance puts income on the y-axis
plot(income ~ balance, data = def,
     main = "Credit Defaults",
     xlab = "Balance",
     ylab = "Income",
     pch = c(1, 2)[as.numeric(student)],
     col = c("IndianRed", "Royalblue")[as.numeric(default)],
     cex = 0.7)
legend("topright", legend = levels(def$default), title = "Default",
       col = c("IndianRed", "Royalblue"), pch = 16)

3. Construct the Multiple Linear Logistic Model for default in terms of the Predictors balance, student and income.

In order to create a logistic model use the glm function, which fits generalised linear models; the logistic model qualifies because it is just a linear function passed through the logistic transformation.

  • Choose family = binomial
    • categorical variables are not normally distributed because they are not continuous; the binomial distribution is the corresponding distribution for binary data
  • Choose link = logit
    • Use probit when the separation of the data is distinct; it can perform slightly better.
    • the term logit is another word for log odds:
      • \(\log{\left(\frac{\text{P}\left(X\right)}{1 - \text{P}\left(X\right)}\right)}\)
ModDef <- glm(formula = default ~ student + balance + income, family = binomial(link = "logit"), data = def)
contrasts(def$default)
##     Yes
## No    0
## Yes   1

4. Show how you would calculate the probability (that default = “Yes”) for a given set of X (predictor) values.

In order to find the probability just use the predict command:

Create the data frame of new data

Use a data frame rather than a list, that way the format of your input, output and predicted data is all equivalent.

newdata <- data.frame(
  # "student" = as.factor(c("Yes", "Yes", "No", "Yes")),  # fixed points
  # or create random points:
  "student" = ifelse(sample(c(0, 1), size = 4, replace = TRUE) == 0, "No", "Yes"),
  "balance" = rnorm(n = 4, mean = 1500, sd = 500),
  "income"  = rnorm(n = 4, mean = 50000, sd = 15000)
)

Now call the predict command and use cbind to combine the newdata with the predicted data.

By default the predict command will output the value of the logit, i.e. the log-odds:[5]

\[ \verb|output| = \log\left( \frac{ \text{Pr}\left(X\right) }{1 - \text{Pr}\left(X\right) } \right) \]

Clearly the raw probability is considerably more useful to us, hence specify the parameter type = "response":

predVals <- predict(object = ModDef, newdata, type = "response") 
cbind("default_prob" = predVals, newdata)
##   default_prob student   balance   income
## 1  0.003597183     Yes  996.0236 58720.54
## 2  0.718389588     Yes 2131.6236 73921.83
## 3  0.106698601     Yes 1611.5783 48155.68
## 4  0.934770901      No 2334.8403 45366.55
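As a sanity check, the link-scale output can be inverted by hand: plogis() is base R’s inverse logit, so applying it to the default predict output recovers the same probabilities.

log_odds <- predict(ModDef, newdata)   # default output: the log-odds
all.equal(plogis(log_odds), predVals)  # TRUE: inverse logit matches type = "response"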

All the corresponding fitted probabilities can be returned by not specifying the newdata parameter in the predict command; it is also possible to read fitted.values straight off the model object. The fitted values represent the probabilities because they are the model’s natural positions along the \(y\)-axis.

ModelPreds <- cbind(def, "Probability" = ModDef$fitted.values)

#ModelPreds <- cbind(def, "Probability" = predict(ModDef, type = 'response'))   #This would do the same thing

In order to move from the probability to the decision the threshold would have to be specified:

threshold <- 0.5
ModelPreds$prediction <- rep_len(x ="No", length.out = nrow(ModelPreds))
ModelPreds$prediction[ModelPreds$Probability > threshold] <- "Yes"
ModelPreds$prediction <- as.factor(ModelPreds$prediction)
ModelPreds <- ModelPreds[, c(2, 6, 7, 3, 4, 5)]  # reorder: default, Probability, prediction first
head(ModelPreds)
##   default  Probability prediction student   balance    income
## 1      No 0.0014287239         No      No  729.5265 44361.625
## 2      No 0.0011222039         No     Yes  817.1804 12106.135
## 3      No 0.0098122716         No      No 1073.5492 31767.139
## 4      No 0.0004415893         No      No  529.2506 35704.494
## 5      No 0.0019355062         No      No  785.6559 38463.496
## 6      No 0.0019895182         No     Yes  919.5885  7491.559
#Alternative approach
#threshold <- 0.5
#Predictions <- ifelse(ModelPreds$Probability < threshold, 0, 1)
#ModelPreds <- cbind(ModelPreds, "Prediction" = Predictions)

5. Construct the Misclassification Matrix

The misclassification matrix (or confusion matrix) tabulates the false and true positives and negatives; it can be a good idea to wrap this in a function, because later on a loop over thresholds will be needed to produce a ROC curve.

Construct functions to Compute FPR and TPR

Base

This is the harder way to do it, and the method Oliver taught us: count each cell of the confusion matrix by hand ("Yes" is taken as the positive class).

TruePos  <- sum(ModelPreds$prediction == "Yes" & ModelPreds$default == "Yes")
TrueNeg  <- sum(ModelPreds$prediction == "No"  & ModelPreds$default == "No")
FalsePos <- sum(ModelPreds$prediction == "Yes" & ModelPreds$default == "No")
FalseNeg <- sum(ModelPreds$prediction == "No"  & ModelPreds$default == "Yes")

A more elegant strategy is table(predictions, observations); the order should be specified this way by convention (as seen in the textbook and online), and it also matches the confusionMatrix() function from the caret package.

By convention the rows will represent the predicted values and the columns will represent the observed values (one page of the textbook prints it the other way around but generates it in code this way, which is a little confusing).

conf.mat <- table(ModelPreds$prediction, ModelPreds$default, dnn = c("Predicted", "Observed"))
print(conf.mat)
##          Observed
## Predicted   No  Yes
##       No  9627  228
##       Yes   40  105
# It can be a little difficult to determine which dimension is the prediction
  # when using table() if you don't keep the (prediction, observation) order
    # straight, so confusionMatrix() from the caret package can triple-check.


conf.mat2 <- confusionMatrix(ModelPreds$prediction, ModelPreds$default)
print(conf.mat2)
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction   No  Yes
##        No  9627  228
##        Yes   40  105
##                                           
##                Accuracy : 0.9732          
##                  95% CI : (0.9698, 0.9763)
##     No Information Rate : 0.9667          
##     P-Value [Acc > NIR] : 0.0001044       
##                                           
##                   Kappa : 0.4278          
##                                           
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.9959          
##             Specificity : 0.3153          
##          Pos Pred Value : 0.9769          
##          Neg Pred Value : 0.7241          
##              Prevalence : 0.9667          
##          Detection Rate : 0.9627          
##    Detection Prevalence : 0.9855          
##       Balanced Accuracy : 0.6556          
##                                           
##        'Positive' Class : No              
## 

6. Calculate the Prob (Misclassification) or Misclassification Rate

In order to determine the misclassification rate, note that in this case the diagonals are the correct predictions and the off-diagonals are the misclassifications:

misclass <- (40+228)/(9627+228+40+105)

print(paste("The misclassification rate is: ", percent(misclass)))
## [1] "The misclassification rate is:  2.68%"

In practice we would just take the mean of the misclassification indicator, because under the hood TRUE and FALSE correspond to 1 and 0 in R.

misclass <- mean(ModelPreds$default != ModelPreds$prediction) 
print(paste("The misclassification rate is: ", percent(misclass)))
## [1] "The misclassification rate is:  2.68%"

7. Calculate the FPR and FNR

Incorrect predictions lie on the off-diagonals.

The false positive rate is the number of times the prediction is positive but the observation is negative, divided by the number of negative observations: the second row, first column of the confusion matrix over the first column’s total.

The false negative rate is the number of times the prediction is negative but the observation is positive, divided by the number of positive observations: the first row, second column over the second column’s total:

conf.mat
##          Observed
## Predicted   No  Yes
##       No  9627  228
##       Yes   40  105
FPR <- conf.mat[2,1] / sum(conf.mat[,1])
FNR <- conf.mat[1,2] / sum(conf.mat[,2])

The false positive rate is 40/9667 ≈ 0.0041 and the false negative rate is 228/333 ≈ 0.6847.

Choose the ideal threshold based on the ROC curve

Terminology

The true positive rate is the number of true positives divided by the number of observed positives:

\[\begin{align} \textsf{TPR} &= \frac{TP}{Pos}\\ &= \frac{TP}{TP + FN}\\ &= 1 - \textsf{FNR} \end{align}\]

\[ \textit{Sensitivity} \text{ is } \textbf{TruePosRate}\\ \textit{Specificity} \text{ is } \textbf{TrueNegRate} \]

A ROC Curve is the TruePosRate (i.e. the Sensitivity) on the \(y\)-axis against the FalsePosRate (i.e. 1 - Specificity) on the \(x\)-axis:

First create a tpr() function:

tpr <- function(model, observation, threshold, silent = TRUE){
  probability <- model$fitted.values

  prediction <- rep_len(x = "No", length.out = length(observation))
  prediction[probability > threshold] <- "Yes"
  # Keep both levels so confusionMatrix() still works if only one class is predicted
  prediction <- factor(prediction, levels = levels(observation))

  conf.mat  <- table(prediction, observation, dnn = c("Predicted", "Observed"))
  conf.mat2 <- confusionMatrix(prediction, observation)
  if (!silent) {
    print(conf.mat2)
  }

  # caret takes "No" as the positive class here, so these rates are
  # relative to predicting "No"
  tpr <- conf.mat2[["byClass"]][["Sensitivity"]]
  fpr <- 1 - conf.mat2[["byClass"]][["Specificity"]]  # 1 - TNR = FPR

  return(c(tpr, fpr))
}

tpr(ModDef, def$default, threshold = 0.5, silent = FALSE)
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction   No  Yes
##        No  9627  228
##        Yes   40  105
##                                           
##                Accuracy : 0.9732          
##                  95% CI : (0.9698, 0.9763)
##     No Information Rate : 0.9667          
##     P-Value [Acc > NIR] : 0.0001044       
##                                           
##                   Kappa : 0.4278          
##                                           
##  Mcnemar's Test P-Value : < 2.2e-16       
##                                           
##             Sensitivity : 0.9959          
##             Specificity : 0.3153          
##          Pos Pred Value : 0.9769          
##          Neg Pred Value : 0.7241          
##              Prevalence : 0.9667          
##          Detection Rate : 0.9627          
##    Detection Prevalence : 0.9855          
##       Balanced Accuracy : 0.6556          
##                                           
##        'Positive' Class : No              
## 
## [1] 0.9958622 0.6846847

Now run a loop to try every threshold
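A minimal sketch of that loop, using the tpr() function defined above (the rates are relative to caret’s positive class, "No", and the threshold grid is an arbitrary choice):

thresholds <- seq(0.01, 0.99, by = 0.01)
roc <- t(sapply(thresholds, function(th)
  tpr(ModDef, def$default, threshold = th, silent = TRUE)))

# TPR against FPR, with the no-skill diagonal for reference
plot(roc[, 2], roc[, 1], type = "l",
     xlab = "False Positive Rate", ylab = "True Positive Rate",
     main = "ROC Curve for the Default Model")
abline(a = 0, b = 1, lty = 2)

The ideal threshold corresponds to the point closest to the top-left corner, i.e. a high true positive rate at a low false positive rate.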

Question 02: Logistic Regression: Use “Smarket” data set.

1. Start R studio. Load the “Smarket” data set built into R (via the ISLR package) and view it using the Console.

library(ISLR)  # the Smarket data set ships with the ISLR package
head(Smarket)
##   Year   Lag1   Lag2   Lag3   Lag4   Lag5 Volume  Today Direction
## 1 2001  0.381 -0.192 -2.624 -1.055  5.010 1.1913  0.959        Up
## 2 2001  0.959  0.381 -0.192 -2.624 -1.055 1.2965  1.032        Up
## 3 2001  1.032  0.959  0.381 -0.192 -2.624 1.4112 -0.623      Down
## 4 2001 -0.623  1.032  0.959  0.381 -0.192 1.2760  0.614        Up
## 5 2001  0.614 -0.623  1.032  0.959  0.381 1.2057  0.213        Up
## 6 2001  0.213  0.614 -0.623  1.032  0.959 1.3491  1.392        Up

2. Construct the Multiple Linear Logistic Model for Direction in terms of the Predictors Lag1, Lag2, Lag3, Lag4, Lag5 and Volume.

#Check that it's a factor
class(Smarket$Direction)
## [1] "factor"
# It is already a factor, but converting again is a harmless safety step
Smarket$Direction <- as.factor(Smarket$Direction)

LogModSmarket <- glm(formula = Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume, data = Smarket, family = binomial(link = "logit"))

3. Show how you would calculate the probability (that Direction = “Up”) for a given set of X (predictor) values.

Generate some newdata

n <- 4
maxval <- max(Smarket$Today)
minval <- min(Smarket$Today)

newdata <- data.frame(
                      Lag1 = sample(seq(from = minval, to = maxval, length.out = 1000), size = n),
                      Lag2 = sample(seq(from = minval, to = maxval, length.out = 1000), size = n),
                      Lag3 = sample(seq(from = minval, to = maxval, length.out = 1000), size = n),
                      Lag4 = sample(seq(from = minval, to = maxval, length.out = 1000), size = n),
                      Lag5 = sample(seq(from = minval, to = maxval, length.out = 1000), size = n),
                      Volume = sample(seq(
                        from = min(Smarket$Volume), to = max(Smarket$Volume), length.out = 1000)
                        , size = n)
                      )

Now feed this data into the model:

predicteddata <- predict(object = LogModSmarket, newdata, type = "response")
predicteddata <- data.frame(Probability = predicteddata, newdata )

In order to create the predictions use contrasts to check how the dummy variables have been assigned.

contrasts(Smarket$Direction)
##      Up
## Down  0
## Up    1

Hence Up is 1, so to create the prediction from the probability (assuming a 0.5 threshold):

predicteddata$Predictions <- "Down"
predicteddata$Predictions[predicteddata$Probability > 0.5] <- "Up"

# Don't forget that it needs to be a factor;
  # this must be done last, because the ifelse() alternative below returns
  # a character vector and would otherwise undo the conversion
predicteddata$Predictions <- as.factor(predicteddata$Predictions)
predicteddata <- predicteddata[,c(8, 1:7)]

# ifelse() could also have been used, as below.
    # This is neat because it applies if/else across the data without a loop.
      # Just remember that it returns values (a character vector here, which
      # would then need converting to a factor); it does not assign in place
      # the way a loop would.

predicteddata$Predictions <- ifelse(predicteddata$Probability > 0.5, "Up", "Down")

predicteddata
##   Predictions Probability       Lag1       Lag2      Lag3      Lag4
## 1          Up   0.5013020 -0.8583814  3.3012282 3.4932102 -4.335388
## 2          Up   0.5845357 -0.8263844 -0.8477157 4.1331502 -3.492801
## 3        Down   0.4970459 -2.2235866  4.2398068 4.0478248 -3.716780
## 4        Down   0.4439511  0.8587908  3.3332252 0.6134805 -1.988942
##         Lag5    Volume
## 1 -4.6446927 1.9040272
## 2 -0.9117097 2.7129957
## 3 -4.8580060 1.2630105
## 4 -1.6583063 0.9858898

4. Construct the Misclassification Matrix

First call upon the model (or predict) to get the modelled values of the data:

  • Calling glm(Y~X)$fitted.values on the model will always return the probabilities: the fitted values are the \(y\)-values, which are naturally the probabilities.
  • If you want to use predict, don’t specify new data and it will default to the training data; however, the values returned will be log-odds, so make sure to specify type = "response".
# Calling fitted.values on the model always returns the probabilities:
# the fitted values are the y-values, naturally the probabilities.

preds <- LogModSmarket$fitted.values

# With predict(), omitting newdata defaults to the training data, but the
# values come back as log-odds unless type = "response" is specified.

probs <- predict(LogModSmarket, type = 'response')

preds <- ifelse(probs > 0.5, "Up", "Down")
preds <- as.factor(preds)

The misclassification matrix may be made with:

  • The built in table command
    • When using this you have to remember to do it in the order prediction, observation
  • the confusionMatrix() command from the caret package
    • the advantage of this is that it automatically labels everything and returns the TPR and related statistics
conf.mat <- table(preds, Smarket$Direction, dnn = c("Predictions", "Observations"))
conf.mat
##            Observations
## Predictions Down  Up
##        Down  145 141
##        Up    457 507
# Using confusionMatrix. Note that caret takes the first factor level ("Down")
  # as the positive class by default, even though the contrasts coded Up as 1.
conf.obj <- confusionMatrix(data = preds, reference = Smarket$Direction )
conf.obj
## Confusion Matrix and Statistics
## 
##           Reference
## Prediction Down  Up
##       Down  145 141
##       Up    457 507
##                                           
##                Accuracy : 0.5216          
##                  95% CI : (0.4935, 0.5496)
##     No Information Rate : 0.5184          
##     P-Value [Acc > NIR] : 0.4216          
##                                           
##                   Kappa : 0.0237          
##                                           
##  Mcnemar's Test P-Value : <2e-16          
##                                           
##             Sensitivity : 0.2409          
##             Specificity : 0.7824          
##          Pos Pred Value : 0.5070          
##          Neg Pred Value : 0.5259          
##              Prevalence : 0.4816          
##          Detection Rate : 0.1160          
##    Detection Prevalence : 0.2288          
##       Balanced Accuracy : 0.5116          
##                                           
##        'Positive' Class : Down            
## 

5. Calculate the Prob (Misclassification) or Misclassification Rate

The diagonals of the matrix represent the correct classifications and the off-diagonals represent the misclassifications, so the misclassification rate is:

(conf.mat[1,2] + conf.mat[2,1])/sum(conf.mat)
## [1] 0.4784
# This can also be obtained from the confusionMatrix object:
1-conf.obj$overall["Accuracy"]
## Accuracy 
##   0.4784

6. Calculate the False Positive rate

The false positive rate is:

\[\begin{align} \textsf{FPR}&= \frac{FP}{Neg}\\ &= \frac{FP}{FP + TN}\\ &= 1 - TNR \end{align}\]

If/when you forget that during the exam, remember you can just use the confusionMatrix() command from the caret package.

# Note: caret has taken "Down" as the positive class here, the opposite of
# what I would have preferred, which makes the labels a little confusing.

# From the confusion Matrix
conf.mat
##            Observations
## Predictions Down  Up
##        Down  145 141
##        Up    457 507
FPR <- (conf.mat[1,2])/(sum(conf.mat[,2]))

# Using `caret::confusionMatrix()`:
# Specificity is the TNR, and FPR = 1 - TNR, so use 1 - Specificity
FPR
## [1] 0.2175926
1-conf.obj$byClass["Specificity"][1]
## Specificity 
##   0.2175926

7. Calculate the False Negative Rate

The False Negative Rate may be determined by swapping every occurrence of positive with negative and vice versa; hence it is given by:

\[\begin{align} \textsf{FNR}&= \frac{FN}{Pos}\\ &= \frac{FN}{FN + TP}\\ &= 1 - TPR \end{align}\]

# As above, "Down" is the positive class here, which makes the labels
# a little confusing, so read carefully.

# From the confusion matrix: take the other off-diagonal entry
  # over the other column total
FNR <- (conf.mat[2,1])/(sum(conf.mat[,1]))

# Using `caret::confusionMatrix()`:
# FNR = 1 - TPR, so use 1 - Sensitivity
FNR
## [1] 0.7591362
1-conf.obj$byClass["Sensitivity"][1]
## Sensitivity 
##   0.7591362

James, Gareth. 2017. An Introduction to Statistical Learning. Springer.


  [1] R Documentation on Contrasts

  [2] Data Camp: Logistic Regression Tutorial

  [3] DataMentor, StackOverflow and StackExchange

  [4] Refer to page 129 of ISL (James 2017)

  [5] Refer to page 133, Equation 4.4 of ISL