What are the advantages of the mutual information over the Pearson correlation for text classification problems?
A problem statement is given below:
Hospital records show that of patients suffering from a certain disease, 75% die of it. What is the probability that of 6 randomly selected patients, 4 will recover?
Which of the following models will you use to solve it?
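As a check on the arithmetic, the scenario can be treated as a binomial experiment. This is a minimal sketch that assumes each patient recovers independently with probability 1 - 0.75 = 0.25; the helper name `binomial_pmf` is illustrative and not part of the question.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Assumption: "recover" is the success event, with probability 1 - 3/4.
p_recover = 1 - Fraction(3, 4)
prob = binomial_pmf(4, 6, p_recover)

print(prob, float(prob))  # 135/4096, about 0.033
```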
A fruit may be considered to be an apple if it is red, round, and about 3" in diameter. A naive Bayes classifier considers each of these features to contribute independently to the probability that this fruit is an apple, regardless of the
Select the statement that correctly applies to Naive Bayes.
Marie is getting married tomorrow, at an outdoor ceremony in the desert. In recent years, it has rained only 5 days each year. Unfortunately, the weatherman has predicted rain for tomorrow. When it actually rains, the weatherman correctly forecasts rain 90% of the time. When it doesn't rain, he incorrectly forecasts rain 10% of the time. Which of the following will you use to calculate the probability that it will rain on the day of Marie’s wedding?
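The numbers in this scenario fit Bayes' theorem, with the rain forecast as the evidence. The sketch below assumes a 365-day year for the prior (the question only says "5 days each year"); the function name `posterior` is illustrative.

```python
from fractions import Fraction

def posterior(prior, p_evidence_given_true, p_evidence_given_false):
    """Bayes' theorem for a binary hypothesis and one piece of evidence."""
    numerator = p_evidence_given_true * prior
    return numerator / (numerator + p_evidence_given_false * (1 - prior))

# Assumption: 5 rainy days out of a 365-day year.
p_rain = Fraction(5, 365)
p_rain_given_forecast = posterior(p_rain, Fraction(9, 10), Fraction(1, 10))

print(p_rain_given_forecast)  # 1/9
```

Even with a rain forecast, the low prior keeps the probability of rain at about 11%.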
You are working on a Data Science project, and during the project you have been given the responsibility of interviewing all the stakeholders in the project. Which phase of the project are you in?
Which analytical method is considered unsupervised?
may have a trend component that is quadratic in nature. Which pattern of data will indicate that the trend in the time series data is quadratic in nature?
What is one modeling or descriptive statistical function in MADlib that is typically not provided in a standard relational database?
Which of the following is a correct example of the target variable in regression (supervised learning)?
Select the correct option from those below.
A researcher is interested in how variables such as GRE (Graduate Record Exam) scores, GPA (grade point average), and the prestige of the undergraduate institution affect admission into graduate school. The response variable, admit/don't admit, is a binary variable.
The above is an example of
Of all the smokers in a particular district, 40% prefer brand A and 60% prefer brand B. Of those smokers who prefer brand A, 30% are female, and of those who prefer brand B, 40% are female. What is the probability that a randomly selected smoker prefers brand A, given that the person selected is female?
Which of the following is the best way to solve this problem?
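One way to work through the numbers is to build the joint probabilities of brand and gender, then condition on the smoker being female. This is a sketch only; the dictionary names are illustrative.

```python
from fractions import Fraction

# Figures taken from the question.
prior = {"A": Fraction(40, 100), "B": Fraction(60, 100)}
female_given_brand = {"A": Fraction(30, 100), "B": Fraction(40, 100)}

# Joint probability P(brand and female) for each brand.
joint = {brand: prior[brand] * female_given_brand[brand] for brand in prior}

# Condition on "female" by dividing by the total probability of that event.
p_female = sum(joint.values())
p_a_given_female = joint["A"] / p_female

print(p_a_given_female)  # 1/3
```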
In which lifecycle stage are test and training data sets created?
You are creating a regression model with the inputs income, education, and current debt of a customer. What could be a possible output of this model?
Which of the following are point estimation methods?
What is the probability that the total of two dice will be greater than 8, given that the first die is a 6?
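The conditional probability can be checked by enumerating the sample space, assuming two fair six-sided dice:

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
outcomes = list(product(range(1, 7), repeat=2))

# Restrict to the conditioning event: the first die shows 6.
given = [(a, b) for a, b in outcomes if a == 6]

# Within that event, keep the outcomes whose total exceeds 8.
favorable = [(a, b) for a, b in given if a + b > 8]

prob = Fraction(len(favorable), len(given))
print(prob)  # 2/3
```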
Assume some output variable "y" is a linear combination of some independent input variables "A" plus some independent noise "e". The way the independent variables are combined is defined by a parameter vector B: y = AB + e, where A is an m x n matrix, B is a vector of n unknowns, and y is a vector of m values. Assuming that m is not equal to n and the columns of A are linearly independent, which expression correctly solves for B?
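For an overdetermined system with linearly independent columns, the standard least-squares estimate comes from the normal equations, (A^T A) B = A^T y. The sketch below solves them in pure Python on a small made-up data set; the helper names and the sample values of A and y are assumptions for illustration, not part of the question.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(P, Q):
    cols = transpose(Q)
    return [[sum(p * q for p, q in zip(row, col)) for col in cols] for row in P]

def solve(M, rhs):
    """Solve the square system M x = rhs by Gauss-Jordan elimination."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for i in range(n):
        # Swap in a row with a nonzero pivot, then normalize it.
        pivot = next(r for r in range(i, n) if aug[r][i] != 0)
        aug[i], aug[pivot] = aug[pivot], aug[i]
        aug[i] = [v / aug[i][i] for v in aug[i]]
        # Eliminate column i from every other row.
        for r in range(n):
            if r != i:
                factor = aug[r][i]
                aug[r] = [vr - factor * vi for vr, vi in zip(aug[r], aug[i])]
    return [row[-1] for row in aug]

def least_squares(A, y):
    """Solve the normal equations (A^T A) B = A^T y."""
    At = transpose(A)
    Aty = [sum(a * v for a, v in zip(row, y)) for row in At]
    return solve(matmul(At, A), Aty)

# Hypothetical data: m = 3 observations, n = 2 unknowns, generated as y = 1 + 2x.
A = [[1, 0], [1, 1], [1, 2]]
y = [1, 3, 5]
B = least_squares(A, y)
print(B)  # [Fraction(1, 1), Fraction(2, 1)]
```

In practice a numerical library routine would be preferred over forming A^T A explicitly; exact fractions are used here only to keep the example free of rounding.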
You are asked to create a model to predict the total number of monthly subscribers for a specific magazine. You are provided with 1 year's worth of subscription and payment data, user demographic data, and 10 years' worth of content of the magazine (articles and pictures). Which algorithm is the most appropriate for building a predictive model for subscribers?
In which of the following scenarios can you use regression to predict the values?
Select the correct statement that applies to supervised learning.