Now that we have our model ready, we can predict work satisfaction in the testing dataset. One way to do so is to use the predict() function. The allow.new.levels argument specifies whether hospitals that were not present when the model was fitted are allowed in the new data. As the same hospitals appear in both the training and testing sets, we set its value to FALSE (which is the default value anyway):
NursesMLtest$predicted = predict(modelRS, NursesMLtest, allow.new.levels = FALSE)
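To see what allow.new.levels changes in practice, here is a minimal sketch. It assumes that modelRS was fitted with lmer() from the lme4 package, and it uses the sleepstudy data shipped with lme4 instead of the nurses data. The names train, test, fit and unseen, as well as the subject label "999", are made up for this illustration.

```r
library(lme4)  # assumption: the model in the text is an lme4 fit

# Example data shipped with lme4: reaction times of 18 subjects over 10 days
data(sleepstudy, package = "lme4")

# Hold out the last two days of every subject, so that each subject
# (the grouping level) appears in both the training and the testing set
train <- subset(sleepstudy, Days < 8)
test  <- subset(sleepstudy, Days >= 8)

# Random intercept and random slope for Days, by subject
fit <- lmer(Reaction ~ Days + (Days | Subject), data = train)

# Same groups in both sets: the default allow.new.levels = FALSE is enough
test$predicted <- predict(fit, newdata = test, allow.new.levels = FALSE)

# A hypothetical subject that the model has never seen
unseen <- data.frame(Days = 8, Subject = factor("999"))

# With FALSE, predict() stops with an error for the unknown level
res_false <- try(predict(fit, newdata = unseen, allow.new.levels = FALSE),
                 silent = TRUE)

# With TRUE, the unknown level gets the population-level prediction,
# that is, the same value as a prediction from the fixed effects only
res_true   <- predict(fit, newdata = unseen, allow.new.levels = TRUE)
fixed_only <- predict(fit, newdata = unseen, re.form = NA)
```

Keeping the value at FALSE when the groups are supposed to match is a useful safeguard: a mistyped or unexpected hospital identifier then produces an error instead of silently receiving a population-level prediction.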
There is no perfect way to measure the quality of predictions for nested data. A simple estimate is the correlation between the predicted and the observed values, which we can assess with a correlation test. Because of the nested structure of our dataset, we will perform the test for each hospital separately:
correls = matrix(nrow=17,ncol=3)
colnames(correls) = c("Correlation", "p value", "r squared")
for (i in 1:17){
  dat = subset(NursesMLtest, hosp == i)
  correls[i,1] = cor.test(dat$predicted...