After performing an ANOVA test, an analyst has determined that a significant effect exists due to income. The analyst wants to compare each Income level to all others while controlling the experimentwise error rate.
Which GLM procedure statement would provide the most appropriate output?
Which statistic, calculated from a validation sample, can help decide which model to use for prediction of a binary target variable?
Within PROC GLM, the interaction between the two categorical predictors, Income and Gender, was shown to be significant. An item store was saved from the GLM analysis.
Which statement from PROC PLM would test the significance of Gender within each level of Income and adjust for multiple tests?
Assume a $10 cost for soliciting a non-responder and a $200 profit for soliciting a responder. The logistic regression model gives a probability score named P_R on a SAS data set called VALID. The VALID data set contains the responder variable Pinch, a 1/0 variable coded as 1 for responder. Customers will be solicited when their probability score is more than 0.05.
Which SAS program computes the profit for each customer in the data set VALID?
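The decision rule in this question can be sketched outside SAS to make the arithmetic explicit. The following is a minimal Python illustration, not one of the answer choices. It assumes that customers who are not solicited contribute a profit of 0, that the $10 cost is recorded as a negative profit, and that the sample rows (which are hypothetical) use the variable names P_R and Pinch from the question.

```python
# Sketch of the per-customer profit rule described in the question.
# Assumptions (not stated in the source): unsolicited customers yield 0,
# and the $10 solicitation cost is stored as a profit of -10.

def solicitation_profit(p_r, responder, cutoff=0.05,
                        profit_responder=200.0, cost_nonresponder=10.0):
    """Return the profit for one customer under the solicitation rule."""
    if p_r <= cutoff:            # score not above the cutoff: no solicitation
        return 0.0
    if responder == 1:           # solicited responder
        return profit_responder
    return -cost_nonresponder    # solicited non-responder

# Hypothetical rows standing in for the VALID data set.
valid = [
    {"P_R": 0.10, "Pinch": 1},
    {"P_R": 0.08, "Pinch": 0},
    {"P_R": 0.05, "Pinch": 1},   # exactly 0.05 is not "more than 0.05"
    {"P_R": 0.01, "Pinch": 0},
]

profits = [solicitation_profit(row["P_R"], row["Pinch"]) for row in valid]
total_profit = sum(profits)
```

Note that the comparison is strict: a score of exactly 0.05 does not trigger solicitation under the rule as worded.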
Refer to the exhibit:
On the Gains Chart, what is the correct interpretation of the horizontal reference line?
This question will ask you to provide a missing statement. Given the following SAS program:
Which SAS statement will complete the program to correctly score the data set NEW_DATA?
Refer to the REG procedure output:
An analyst has selected this model as a champion because it shows better model fit than a competing model with more predictors.
Which statistic justifies this rationale?
Refer to the exhibit:
These graphs were created using the GLM procedure with the plots(only)=diagnostics option.
Which plot do you use to identify influential observations?
Refer to the exhibit:
The plots represent two models, A and B, being fit to the same two data sets, training and validation.
Model A is 90.5% accurate at distinguishing blue from red on the training data and 75.5% accurate at doing the same on validation data. Model B is 83% accurate at distinguishing blue from red on the training data and 78.3% accurate at doing the same on the validation data.
Which of the two models should be selected and why?
Refer to the confusion matrix:
An analyst determines that loan defaults occur at the rate of 3% in the overall population. The above confusion matrix is from an oversampled test set (1 = default).
What is the sensitivity adjusted for the population event probability?
Enter your answer in the space below. Round to three decimals (example: n.nnn).
This question will ask you to provide a missing option.
Complete the following syntax to test the homogeneity of variance assumption in the GLM procedure:
means Region /
Refer to the REG procedure output:
Which characteristic of Studentized residuals indicates potential outliers?
Refer to the exhibit:
An analyst examined logistic regression models for predicting whether a customer would make a purchase. The ROC curve displayed summarizes the models. Using the selected model and the analyst's decision rule, 25% of the customers who did not make a purchase are incorrectly classified as purchasers.
What can be concluded from the graph?