607 + Mcqs in Machine Learning Page 6 McqOptions

251.	Naive Bayes classifiers are ________ learning algorithms.
A.	supervised
B.	unsupervised
C.	both
D.	none
Answer» A. supervised

252.	Naive Bayes classifiers are a collection of ________ algorithms.
A.	classification
B.	clustering
C.	regression
D.	all
Answer» A. classification

253.	Suppose you plot the residuals against the predicted values in linear regression and find that there is a relationship between them. Which of the following conclusions do you draw about this situation?
A.	since there is a relationship, our model is not good
B.	since there is a relationship, our model is good
C.	can't say
D.	none of these
Answer» A. since there is a relationship, our model is not good
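
To illustrate why a pattern in the residuals points to a poor model, here is a minimal sketch in plain Python. The data and the helper name `fit_line` are made up for illustration. It fits a straight line to data that actually follows y = x², and the residuals come out U-shaped (positive, negative, positive) instead of scattering randomly around zero.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Illustrative data: the true relationship is quadratic, not linear
xs = [0, 1, 2, 3, 4]
ys = [x ** 2 for x in xs]

intercept, slope = fit_line(xs, ys)
predicted = [intercept + slope * x for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]

print(predicted)   # fitted values of the straight line
print(residuals)   # systematic U shape: the line misses the curvature
```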

254.	Which of the following statements is true about outliers in linear regression?
A.	linear regression is sensitive to outliers
B.	linear regression is not sensitive to outliers
C.	can't say
D.	none of these
Answer» A. linear regression is sensitive to outliers
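
The sensitivity to outliers can be seen in a small experiment. The sketch below assumes NumPy is available and uses made-up data: five points lying exactly on y = x, then the same points with one corrupted value. A single outlier is enough to change the fitted slope substantially.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y_clean = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # exactly y = x
y_outlier = y_clean.copy()
y_outlier[-1] = 14.0                             # one corrupted observation

# np.polyfit with degree 1 returns [slope, intercept] of the least-squares line
slope_clean, intercept_clean = np.polyfit(x, y_clean, 1)
slope_outlier, intercept_outlier = np.polyfit(x, y_outlier, 1)

print(round(float(slope_clean), 3), round(float(slope_outlier), 3))
```

Because least squares penalizes squared errors, the one large error dominates the objective and drags the whole line toward the outlier.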

255.	Overfitting is more likely when you have a huge amount of data to train on.
A.	true
B.	false
Answer» B. false

256.	Which of the following is true about residuals?
A.	lower is better
B.	higher is better
C.	a or b, depending on the situation
D.	none of these
Answer» A. lower is better

257.	Which of the following methods do we use to find the best fit line for data in Linear Regression?
A.	least square error
B.	maximum likelihood
C.	logarithmic loss
D.	both a and b
Answer» A. least square error

258.	It is possible to design a linear regression algorithm using a neural network.
A.	true
B.	false
Answer» A. true
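
One way to see this: a single neuron with no activation function computes weight * x + bias, which is exactly a linear regression model. The sketch below is a minimal illustration in plain Python, with made-up data generated from y = 2x + 1 and an arbitrarily chosen learning rate and epoch count. It trains that one neuron by gradient descent on the mean squared error.

```python
# Training data generated from y = 2x + 1 (illustrative values)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

weight, bias = 0.0, 0.0      # the neuron's two trainable parameters
learning_rate = 0.05         # assumed value, small enough to converge here

for _ in range(2000):
    grad_w = 0.0
    grad_b = 0.0
    for x, y in zip(xs, ys):
        error = (weight * x + bias) - y          # prediction minus target
        grad_w += 2 * error * x / len(xs)        # d(MSE)/d(weight)
        grad_b += 2 * error / len(xs)            # d(MSE)/d(bias)
    weight -= learning_rate * grad_w
    bias -= learning_rate * grad_b

print(round(weight, 4), round(bias, 4))
```

The learned weight and bias should approach the slope 2 and intercept 1 used to generate the data.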

259.	Linear Regression is a supervised machine learning algorithm.
A.	true
B.	false
Answer» A. true

260.	In the mathematical equation of linear regression Y = β1 + β2X + ε, (β1, β2) refers to
A.	(x-intercept, slope)
B.	(slope, x-intercept)
C.	(y-intercept, slope)
D.	(slope, y-intercept)
Answer» C. (y-intercept, slope)

261.	In the syntax of the linear model function lm(formula, data, ...), data refers to
A.	matrix
B.	vector
C.	array
D.	list
Answer» D. list

262.	Function used for linear regression in R is
A.	lm(formula, data)
B.	lr(formula, data)
C.	lrm(formula, data)
D.	regression.linear(formula, data)
Answer» A. lm(formula, data)

263.	In a simple linear regression model (one independent variable), if we change the input variable by 1 unit, how much will the output variable change?
A.	by 1
B.	no change
C.	by intercept
D.	by its slope
Answer» D. by its slope

264.	How many coefficients do you need to estimate in a simple linear regression model (One independent variable)?
A.	1
B.	2
C.	3
D.	4
Answer» B. 2

265.	Which of the following metrics can be used for evaluating regression models? i) R Squared ii) Adjusted R Squared iii) F Statistics iv) RMSE / MSE / MAE
A.	ii and iv
B.	i and ii
C.	ii, iii and iv
D.	i, ii, iii and iv
Answer» D. i, ii, iii and iv

266.	If a linear regression model fits perfectly, i.e., train error is zero, then
A.	test error is also always zero
B.	test error is non zero
C.	can't comment on test error
D.	test error is equal to train error
Answer» C. can't comment on test error

267.	________ adopts a dictionary-oriented approach, associating to each category label a progressive integer number.
A.	LabelEncoder class
B.	LabelBinarizer class
C.	DictVectorizer
D.	FeatureHasher
Answer» A. LabelEncoder class
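
The "dictionary-oriented approach" can be sketched in a few lines of plain Python. This is not scikit-learn's implementation; `encode_labels` is a hypothetical helper that only mimics the idea of mapping each distinct label (in sorted order) to a progressive integer.

```python
def encode_labels(labels):
    """Map each distinct label to an integer 0, 1, 2, ... in sorted order."""
    classes = sorted(set(labels))
    mapping = {label: index for index, label in enumerate(classes)}
    codes = [mapping[label] for label in labels]
    return codes, mapping

# Illustrative categorical target
codes, mapping = encode_labels(["Male", "Female", "Female", "Male", "Other"])
print(mapping)
print(codes)
```

A one-hot scheme such as the one LabelBinarizer produces is different: it yields one binary column per class rather than a single integer per label.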

268.	In many classification problems, the target ________ is made up of categorical labels which cannot immediately be processed by any algorithm.
A.	random_state
B.	dataset
C.	test_size
D.	all above
Answer» B. dataset

269.	The parameter ________ allows specifying the percentage of elements to put into the test/training set
A.	test_size
B.	training_size
C.	all above
D.	none of these
Answer» A. test_size

270.	What are common feature selection methods in regression tasks?
A.	correlation coefficient
B.	greedy algorithms
C.	all above
D.	none of these
Answer» C. all above

271.	Can a model trained for item-based similarity also choose from a given set of items?
A.	yes
B.	no
Answer» A. yes

272.	What are PCA, KPCA and ICA used for?
A.	principal components analysis
B.	kernel based principal component analysis
C.	independent component analysis
D.	all above
Answer» D. all above

273.	What would you do in PCA to get the same projection as SVD?
A.	transform data to zero mean
B.	transform data to zero median
C.	not possible
D.	none of these
Answer» A. transform data to zero mean
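
The role of centering can be checked numerically. The sketch below assumes NumPy and uses synthetic data with a deliberately large offset. It compares the leading PCA direction (top eigenvector of the covariance matrix) with the first right singular vector of the data, once after subtracting the mean and once on the raw data. Singular vectors are only defined up to sign, so the comparison uses the absolute value of the dot product.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data: most variance along the first axis, far from the origin
X = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.2]) + np.array([2.0, -20.0, 1.0])

# PCA direction: top eigenvector of the covariance matrix
cov = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
pca_axis = eigvecs[:, np.argmax(eigvals)]

# SVD of the zero-mean data
X_centered = X - X.mean(axis=0)
_, _, vt_centered = np.linalg.svd(X_centered, full_matrices=False)

# SVD of the raw (uncentered) data
_, _, vt_raw = np.linalg.svd(X, full_matrices=False)

match_centered = abs(pca_axis @ vt_centered[0])   # close to 1: same direction
match_raw = abs(pca_axis @ vt_raw[0])             # well below 1: different direction

print(round(float(match_centered), 6), round(float(match_raw), 3))
```

Without centering, the first singular vector mostly points toward the data mean instead of along the direction of greatest variance.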

274.	A feature F1 can take the values A, B, C, D, E and F and represents the grades of students from a college. Which of the following statements is true in this case?
A.	feature F1 is an example of a nominal variable.
B.	feature F1 is an example of an ordinal variable.
C.	it doesn't belong to either of the above categories.
D.	both of these
Answer» B. feature F1 is an example of an ordinal variable.

275.	________ performs a PCA with non-linearly separable data sets.
A.	SparsePCA
B.	KernelPCA
C.	SVD
D.	none of the mentioned
Answer» B. KernelPCA

276.	Which of the following selects only a subset of features belonging to a certain percentile?
A.	SelectPercentile
B.	FeatureHasher
C.	SelectKBest
D.	all above
Answer» A. SelectPercentile

277.	There are also many univariate methods that can be used in order to select the best features according to specific criteria based on ________.
A.	f-tests and p-values
B.	chi-square
C.	anova
D.	all above
Answer» A. f-tests and p-values

278.	scikit-learn also provides a class for per-sample normalization, Normalizer. It can apply ________ to each element of a dataset
A.	max, l0 and l1 norms
B.	max, l1 and l2 norms
C.	max, l2 and l3 norms
D.	max, l3 and l4 norms
Answer» B. max, l1 and l2 norms
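
What per-sample normalization does under each of the three norms can be sketched in plain Python. This is an illustrative re-implementation, not scikit-learn's code; `normalize_sample` is a hypothetical helper, and treating "max" as the largest absolute value is an assumption of this sketch.

```python
import math

def normalize_sample(sample, norm="l2"):
    """Scale one sample (a list of floats) so that its chosen norm equals 1."""
    if norm == "l1":
        scale = sum(abs(v) for v in sample)            # sum of absolute values
    elif norm == "l2":
        scale = math.sqrt(sum(v * v for v in sample))  # Euclidean length
    elif norm == "max":
        scale = max(abs(v) for v in sample)            # largest absolute value
    else:
        raise ValueError("norm must be 'l1', 'l2' or 'max'")
    if scale == 0:
        return list(sample)                            # leave all-zero samples unchanged
    return [v / scale for v in sample]

sample = [3.0, -4.0]
l1_scaled = normalize_sample(sample, "l1")
l2_scaled = normalize_sample(sample, "l2")
max_scaled = normalize_sample(sample, "max")
print(l1_scaled, l2_scaled, max_scaled)
```

Each sample (row) is scaled independently, which is what distinguishes this from per-feature scalers that work column by column.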

279.	If you need a more powerful scaling feature, with superior control over outliers and the possibility to select a quantile range, there's also the class ________.
A.	RobustScaler
B.	DictVectorizer
C.	LabelBinarizer
D.	FeatureHasher
Answer» A. RobustScaler

280.	It's possible to use a different placeholder through the parameter ________.
A.	regression
B.	classification
C.	random_state
D.	missing_values
Answer» D. missing_values

281.	________ is much more difficult because it's necessary to determine a supervised strategy to train a model for each feature and, finally, to predict their value
A.	removing the whole line
B.	creating sub-model to predict those features
C.	using an automatic strategy to input them according to the other known values
D.	all above
Answer» B. creating sub-model to predict those features

282.	What is a test set?
A.	a test set is used to test the accuracy of the hypotheses generated by the learner.
B.	it is a set of data used to discover the potentially predictive relationship.
C.	both a & b
D.	none of above
Answer» A. a test set is used to test the accuracy of the hypotheses generated by the learner.

283.	What is overfitting in machine learning?
A.	overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship.
B.	robots are programmed so that they can perform the task based on data they gather from sensors.
C.	overfitting occurs during the process of learning.
D.	a set of data is used to discover the potentially predictive relationship
Answer» A. overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship.

284.	Which of the following sentences is correct?
A.	machine learning relates with the study, design and
B.	data mining can be defined as the process in which the
C.	both a & b
D.	none of the above
Answer» C. both a & b

285.	During the last few years, many ________ algorithms have been applied to deep neural networks to learn the best policy for playing Atari video games and to teach an agent how to associate the right action with an input representing the state.
A.	logical
B.	classical
C.	classification
D.	none of above
Answer» D. none of above

286.	Which of the following are supervised learning applications?
A.	spam detection, pattern detection, natural language processing
B.	image classification, real-time visual tracking
C.	autonomous car driving, logistic optimization
D.	bioinformatics, speech recognition
Answer» A. spam detection, pattern detection, natural language processing

287.	If there is only a discrete number of possible outcomes (called categories), the process becomes a ________.
A.	regression
B.	classification.
C.	modelfree
D.	categories
Answer» B. classification.

288.	Reinforcement learning is particularly efficient when ________.
A.	the environment is not completely deterministic
B.	it's often very dynamic
C.	it's impossible to have a precise error measure
D.	all above
Answer» D. all above

289.	Common unsupervised applications include
A.	object segmentation
B.	similarity detection
C.	automatic labeling
D.	all above
Answer» D. all above

290.	What is the function of supervised learning?
A.	classifications, predict time series, annotate strings
B.	speech recognition, regression
C.	both a & b
D.	none of above
Answer» C. both a & b

291.	Common deep learning applications include
A.	image classification, real-time visual tracking
B.	autonomous car driving, logistic optimization
C.	bioinformatics, speech recognition
D.	all above
Answer» D. all above

292.	What is a training set?
A.	a training set is used to test the accuracy of the hypotheses generated by the learner.
B.	a set of data used to discover the potentially predictive relationship.
C.	both a & b
D.	none of above
Answer» B. a set of data used to discover the potentially predictive relationship.

293.	What are the popular algorithms of Machine Learning?
A.	decision trees and neural networks (back propagation)
B.	probabilistic networks and nearest neighbor
C.	support vector machines
D.	all
Answer» D. all

294.	How can you avoid overfitting?
A.	by using a lot of data
B.	by using inductive machine learning
C.	by using validation only
D.	none of above
Answer» A. by using a lot of data

295.	According to ________, it's a key success factor for the survival and evolution of all species.
A.	Claude Shannon's theory
B.	Gini index
C.	Darwin's theory
D.	none of above
Answer» C. Darwin's theory

296.	Suppose you are given 'n' predictions on test data by 'n' different models (M1, M2, ..., Mn) respectively. Which of the following method(s) can be used to combine the predictions of these models? Note: We are working on a regression problem. 1. Median 2. Product 3. Average 4. Weighted sum 5. Minimum and Maximum 6. Generalized mean rule
A.	1, 3 and 4
B.	1, 3 and 6
C.	1, 3, 4 and 6
D.	all of above
Answer» D. all of above
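
Three of the listed combination rules can be sketched with the standard library alone. The predictions and weights below are made-up values for a single test sample, chosen so that one model is clearly off.

```python
from statistics import mean, median

# Predictions of four hypothetical models for the same test sample
predictions = [10.0, 12.0, 11.0, 30.0]
# Illustrative weights that sum to 1 (lower weight for the less trusted model)
weights = [0.4, 0.3, 0.2, 0.1]

average = mean(predictions)                                      # simple average
middle = median(predictions)                                     # robust to the stray 30.0
weighted_sum = sum(w * p for w, p in zip(weights, predictions))  # weighted sum

print(average, middle, round(weighted_sum, 2))
```

The median and the weighted sum are both pulled less by the outlying prediction than the plain average is, which is one reason to consider more than one combination rule.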

297.	How can we assign the weights to output of different models in an ensemble? 1. Use an algorithm to return the optimal weights 2. Choose the weights using cross validation 3. Give high weights to more accurate models
A.	1 and 2
B.	1 and 3
C.	2 and 3
D.	all of above
Answer» D. all of above

298.	Which of the following is true about an averaging ensemble?
A.	it can only be used in classification problem
B.	it can only be used in regression problem
C.	it can be used in both classification as well as regression
D.	none of these
Answer» C. it can be used in both classification as well as regression

299.	Which of the following is true about weighted majority votes? 1. We want to give higher weights to better performing models 2. Inferior models can overrule the best model if collective weighted votes for inferior models is higher than best model 3. Voting is special case of weighted voting
A.	1 and 3
B.	2 and 3
C.	1 and 2
D.	1, 2 and 3
Answer» D. 1, 2 and 3
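
All three statements can be checked with a small sketch in plain Python. `weighted_vote` is a hypothetical helper and the labels and weights are made up for illustration.

```python
from collections import defaultdict

def weighted_vote(votes, weights):
    """Return the label whose supporters have the largest total weight."""
    totals = defaultdict(float)
    for label, weight in zip(votes, weights):
        totals[label] += weight
    return max(totals, key=totals.get)

votes = ["spam", "ham", "ham"]   # the first model is the best one

# The best model dominates when its weight exceeds the others combined
dominant = weighted_vote(votes, [0.7, 0.2, 0.2])

# Two weaker models overrule the best one: 0.3 + 0.3 > 0.5
overruled = weighted_vote(votes, [0.5, 0.3, 0.3])

# Equal weights reduce to plain majority voting
plain = weighted_vote(votes, [1.0, 1.0, 1.0])

print(dominant, overruled, plain)
```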

300.	Which of the following are correct statement(s) about stacking? 1. A machine learning model is trained on predictions of multiple machine learning models 2. A logistic regression will definitely work better in the second stage as compared to other classification methods 3. First stage models are trained on full / partial feature space of training data
A.	1 and 2
B.	2 and 3
C.	1 and 3
D.	all of above
Answer» C. 1 and 3
