Correct Option: A
Explanation: Data Analysis -> Pre-Processing -> Model Building -> Predict
Q: Which pre-processing technique is used to remove the most commonly used words?
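A. Tokenization
B. Lemmatization
C. Stopword removal
Correct Option: C
Explanation: Stopword removal. It filters out very common words (such as "the", "is" and "and") that carry little meaning on their own.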
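As a rough illustration of stopword removal, here is a minimal sketch in plain Python. The STOPWORDS set and the remove_stopwords helper are hypothetical names made up for this example, and the list is deliberately tiny; real projects usually rely on a curated stopword list from an NLP library.

```python
# Hypothetical, deliberately tiny stopword list (an assumption for illustration).
STOPWORDS = {"the", "is", "a", "an", "and", "of", "to", "in"}


def remove_stopwords(text):
    """Lowercase the text, split it on whitespace, and drop any stopwords."""
    tokens = text.lower().split()
    return [tok for tok in tokens if tok not in STOPWORDS]


filtered = remove_stopwords("The movie is a triumph of style and wit")
print(filtered)  # ['movie', 'triumph', 'style', 'wit']
```

Splitting on whitespace stands in for a proper tokenizer here; punctuation handling is left out to keep the sketch short.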
Q: The most widely used package for machine learning in Python is _________
A. bottle
B. django
C. sklearn
D. pillow
Correct Option: C
Explanation: sklearn (scikit-learn). bottle and django are web frameworks, and pillow is an imaging library.
Q: Clustering is supervised classification.
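A. True
B. False
Correct Option: B
Explanation: False. Clustering is an unsupervised technique: it groups unlabeled data, whereas classification learns from labeled examples.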
Q: True Negative is when the predicted instance and the actual instance are positive.
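A. True
B. False
Correct Option: B
Explanation: False. A true negative is when both the predicted and the actual instance are negative. When both are positive, it is a true positive.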
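The four confusion-matrix outcomes can be counted with a short sketch like the one below. The confusion_counts helper and the sample label lists are hypothetical and exist only for illustration; the positive class is assumed to be encoded as 1.

```python
def confusion_counts(actual, predicted, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            # Predicted positive: correct only if the actual label is positive too.
            counts["TP" if a == positive else "FP"] += 1
        else:
            # Predicted negative: correct only if the actual label is negative too.
            counts["FN" if a == positive else "TN"] += 1
    return counts


# Made-up labels, assumed for illustration only.
actual = [1, 0, 1, 0, 0, 1]
predicted = [1, 0, 0, 0, 1, 1]
counts = confusion_counts(actual, predicted)
print(counts)  # {'TP': 2, 'TN': 2, 'FP': 1, 'FN': 1}
```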
Q:
a) Download the dataset from https://hrcdn.net/s3_pub/istreet-assets/H4_TQkbOj39HUNoBukluIQ/training.txt and load it to the variable 'sentiment_analysis_data'.
b) Give the column names as 'label' and 'message'.
c) Try out the code snippets and answer the questions.
Which of the following commands is used to view the dataset SIZE, and what is the value returned?
Q: The cross-validation technique is used to evaluate a classifier by dividing the data set into a training set to train the classifier and a testing set to test it.
A. True
B. False
Correct Option: A
Explanation: True. In k-fold cross-validation this split is repeated k times, so that each fold serves once as the testing set.
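To make the train/test division concrete, here is a minimal sketch of how k-fold cross-validation could partition sample indices. The k_fold_indices function is a hypothetical helper written for this example, not a library API, and it uses contiguous folds without shuffling.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(6, 3))
for train, test in folds:
    print("train:", train, "test:", test)
```

Each sample index appears in exactly one testing set, so the classifier is evaluated once on every sample while never being tested on data it was trained on in that round.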
Q:
a) Download the dataset from https://hrcdn.net/s3_pub/istreet-assets/H4_TQkbOj39HUNoBukluIQ/training.txt and load it to the variable 'sentiment_analysis_data'.
b) Give the column names as 'label' and 'message'.
c) Try out the code snippets and answer the questions.
What is the output of the following command: print(sentiment_analysis_data['label'].unique())
A. [yes no]
B. [true false]
C. [1 0]
D. None of the options
Correct Option: C
Explanation: [1 0]
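Steps a) to c) could be tried out along the lines of the sketch below, which uses pandas. It assumes the file is tab-separated with no header row, and it substitutes a few made-up rows for the real download so that it runs offline; neither the file layout nor the sample rows come from the dataset itself.

```python
import io

import pandas as pd

# Hypothetical stand-in rows (assumption): the real training.txt is assumed
# to be tab-separated, with the label first and the message second.
sample = "1\tI loved this movie\n0\tThis was awful\n1\tReally great\n"

# For the real file, the URL from step a) would take the place of io.StringIO(sample).
sentiment_analysis_data = pd.read_csv(
    io.StringIO(sample),
    sep="\t",
    header=None,                 # the file is assumed to have no header row
    names=["label", "message"],  # column names from step b)
)

print(sentiment_analysis_data["label"].unique())
```

unique() returns the distinct values in order of first appearance, which is why a file whose first row is labeled 1 prints [1 0] rather than [0 1].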
Q: Identify the unstructured data from the following.