Top Data Science Interview Questions on R (Basic)

Basic R Interview Questions

1. Compare R & Python

R programming language | Python programming language
Model building is similar to Python. | Model building is similar to R.
Model interpretability is good. | Model interpretability is weaker.
Production deployment is weaker than in Python. | Production deployment is good.
Community support is stronger than Python's. | Community support is weaker than R's.
Data science libraries are comparable to Python's. | Data science libraries are comparable to R's.
Data visualization libraries and tools are strong. | Data visualization is weaker than in R.
The learning curve is steep. | The learning curve is gentler than R's.

2. Explain how data is imported in R.

R provides several ways to import data, including through the R Commander GUI. To start R Commander, type library(Rcmdr) into the console. Data can then be imported in three ways:

  • Select the data set in the dialog box, or enter the name of the data set as required.
  • Enter the data directly in the R Commander editor via Data -> New Data Set. This works well only when the data set is not too large.
  • Import the data from a URL, a plain text (ASCII) file, another statistical package, or the clipboard.
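Importing from a plain text file can also be done directly from the console. The following is a minimal sketch using base R's read.csv(); the file contents and the names csv_path and scores are made up for illustration.

```r
# Write a small CSV to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,score", "1,90", "2,85", "3,78"), csv_path)

# read.csv() reads a comma-separated text file into a data frame.
scores <- read.csv(csv_path)

# Inspect the structure of the imported data.
str(scores)
```

The same call accepts a URL in place of a local path, which covers the URL case listed above.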

3. Explain how to communicate the outputs of data analysis using R.

Combine the data, code, and analysis results in a single document using knitr, which supports reproducible research. Such a document lets others verify the findings, build on them, and discuss them. Reproducible research also makes it easy to rerun an analysis with new data values and to apply it to different problems.

4. What is the difference between the library() and require() functions in R?

library() | require()
Stops with an error if the requested package cannot be loaded. | Returns FALSE and gives a warning if the package is not found; it is designed for use inside functions.
Loads the package if it is not already loaded. | Checks whether the package is loaded and loads it if it is not (useful in functions that rely on a certain package).

The documentation explicitly states that neither function will reload an already loaded package.

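The following snippet puts this difference to use: it installs a package only when require() fails to load it.

# 'package' is a character string holding the package name
if (!require(package, character.only = TRUE, quietly = TRUE)) {
  install.packages(package)
  library(package, character.only = TRUE)
}

For multiple packages you can use:

for (package in c("", "")) {  # put the package names inside the quotes
  if (!require(package, character.only = TRUE, quietly = TRUE)) {
    install.packages(package)
    library(package, character.only = TRUE)
  }
}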

5. What is R?

R is a programming language used for statistical computing, data analysis, and the development of statistical software. It is also increasingly used for machine learning applications.

6. How are comments written in R?

A comment starts with #. Everything after the # on that line is ignored by R, for example #division.

7. What is a t-test in R?

A t-test is used to determine whether the means of two groups are equal. In R it is performed with the t.test() function.
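As a minimal sketch, the call below compares two small groups of made-up measurements. The vectors group_a and group_b are illustrative data, and the 0.05 cutoff is a conventional assumption rather than anything fixed by R.

```r
# Two small groups of made-up measurements.
group_a <- c(5.1, 4.9, 5.6, 5.8, 6.0)
group_b <- c(6.9, 7.1, 6.8, 7.4, 7.0)

# By default t.test() runs a two-sided Welch two-sample t-test.
result <- t.test(group_a, group_b)

# A small p-value is evidence against the hypothesis of equal means.
means_differ <- result$p.value < 0.05
print(result)
```

Passing var.equal = TRUE switches to the classic pooled-variance test, and paired = TRUE handles paired observations.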

8. What are the disadvantages of R Programming?

The disadvantages are:

  • It lacks a standard GUI.
  • It is not well suited to big data, because it keeps objects in memory.
  • Does not provide spreadsheet view of data.

9. What is the use of the with() and by() functions in R?

The with() function evaluates an expression using the columns of a dataset.

#with(data,expression)

The by() function applies a function to each level of a factor or combination of factors.

#by(data,factorlist,function)
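A short sketch of both functions on a made-up data frame (the names df, group, and score are illustrative):

```r
# A small made-up dataset with one factor and one numeric column.
df <- data.frame(
  group = factor(c("a", "a", "b", "b")),
  score = c(10, 20, 30, 50)
)

# with() evaluates the expression with the columns of df in scope,
# so 'score' can be used without writing df$score.
overall_mean <- with(df, mean(score))

# by() splits 'score' by the levels of 'group' and applies mean() to each part.
group_means <- by(df$score, df$group, mean)

print(overall_mean)
print(group_means)
```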

10. How are missing values represented in R?

In R, missing values are represented by NA, which must be written in capital letters.
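A brief sketch of how NA behaves in practice, using a made-up vector x:

```r
# A numeric vector with one missing value.
x <- c(4, NA, 6)

# is.na() flags which elements are missing.
missing_flags <- is.na(x)

# Most summaries return NA when any input is missing...
mean_with_na <- mean(x)

# ...unless the missing values are dropped with na.rm = TRUE.
mean_without_na <- mean(x, na.rm = TRUE)
```

Here mean_with_na is NA, while mean_without_na is the mean of 4 and 6.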

11. What is the use of the subset() and sample() functions in R?

subset() selects the variables and observations that meet given conditions, and sample() draws a random sample of size n from a dataset.
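The sketch below applies both functions to a made-up data frame; the column names and the seed value are illustrative choices.

```r
# A small made-up dataset.
df <- data.frame(id = 1:6, age = c(23, 35, 41, 29, 52, 38))

# subset() keeps the rows where the condition holds and the listed columns.
older <- subset(df, age > 30, select = c(id, age))

# sample() draws row indices at random; set.seed() makes the draw repeatable.
set.seed(42)
picked <- df[sample(nrow(df), 3), ]
```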

12. Explain what transpose is.

Transposing swaps the rows and columns of a matrix or data frame, which is a common step when reshaping data for analysis. It is performed with the t() function.
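As an illustration, here is a made-up 2 x 3 matrix and its transpose:

```r
# matrix() fills column by column, so row 1 is 1, 3, 5 and row 2 is 2, 4, 6.
m <- matrix(1:6, nrow = 2)

# t() swaps rows and columns, giving a 3 x 2 matrix.
tm <- t(m)

print(m)
print(tm)
```

Each element m[i, j] ends up at tm[j, i].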

13. What are the advantages of R?

The advantages are:

  • It is well suited to managing and manipulating data.
  • It has no license restrictions.
  • It is free and open source software.
  • Its graphical capabilities are good.
  • It runs on many operating systems and hardware platforms, on both 32-bit and 64-bit processors.

14. Which function is used for adding datasets in R?

The rbind() function adds (stacks) two datasets, but both datasets must have the same columns.

Syntax: rbind(x1, x2, ...) where x1, x2 are vectors, matrices, or data frames.
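A minimal sketch with two made-up data frames that share the same columns:

```r
# Two datasets with identical column names and types.
first_batch  <- data.frame(id = 1:2, score = c(90, 85))
second_batch <- data.frame(id = 3:4, score = c(78, 88))

# rbind() stacks the rows of the second dataset under the first.
combined <- rbind(first_batch, second_batch)
print(combined)
```

If the column names differ, rbind() on data frames stops with an error instead of guessing how to align them.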

15. How can you produce correlations and covariances?

Correlations are produced by the cor() function and covariances by the cov() function.
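A small worked sketch on made-up vectors where y is exactly twice x, so the two are perfectly correlated:

```r
# y is a perfect linear function of x.
x <- c(1, 2, 3, 4, 5)
y <- 2 * x

# cor() returns the Pearson correlation by default.
correlation <- cor(x, y)

# cov() returns the sample covariance (denominator n - 1).
covariance <- cov(x, y)
```

Both functions also accept a data frame or matrix and then return a full correlation or covariance matrix.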

16. What is the difference between a matrix and a data frame?

A data frame can hold columns of different data types, whereas a matrix can hold only a single data type.
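The difference shows up as soon as numbers and text are mixed. In this sketch (made-up values), the matrix coerces everything to character, while the data frame keeps each column's own type:

```r
# Binding a numeric and a character column into a matrix coerces both to character.
m <- cbind(id = c(1, 2), name = c("a", "b"))
matrix_type <- typeof(m)

# A data frame stores each column separately, so 'id' stays numeric.
df <- data.frame(id = c(1, 2), name = c("a", "b"))
id_is_numeric <- is.numeric(df$id)
```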

17. What is the difference between lapply and sapply?

lapply always returns its output as a list, whereas sapply simplifies the output to a vector or matrix when possible.
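A minimal sketch applying the same function with both, on a made-up list:

```r
# A named list of integer vectors.
nums <- list(a = 1:3, b = 4:6)

# lapply() returns a list with one element per input element.
as_list <- lapply(nums, sum)

# sapply() simplifies the same result to a named vector.
as_vector <- sapply(nums, sum)
```

When the function returns results of differing lengths, sapply() cannot simplify and falls back to returning a list.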

18. What is the difference between seq(4) and seq_along(4)?

seq(4) returns the vector from 1 to 4 (c(1,2,3,4)), whereas seq_along(4) returns a sequence as long as its argument; since 4 is a vector of length 1, the result is c(1).
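The contrast is easy to check in the console. The vector values below is made up for illustration:

```r
# seq(4) counts from 1 up to 4.
up_to_four <- seq(4)

# seq_along(4) indexes the vector c(4), which has a single element.
along_four <- seq_along(4)

# seq_along() is mostly used to index an existing vector.
values <- c(10, 20, 30)
positions <- seq_along(values)

# For an empty vector it returns an empty sequence, which keeps loops safe.
no_positions <- seq_along(numeric(0))
```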

19. Explain how you can start the R Commander GUI.

The R Commander GUI is started by loading the Rcmdr package with library(Rcmdr).

20. What is the memory limit of R?

On a 32-bit system the memory limit is about 3 GB, although most versions are limited to 2 GB. On a 64-bit system the memory limit is about 8 TB.

21. How many data structures does R have?

R has five data structures: vector, matrix, and array, which are homogeneous, and list and data frame, which are heterogeneous.

22. Explain how data is aggregated in R.

Data is aggregated by collapsing it over one or more BY (grouping) variables and applying a summary function to each group. When the aggregate() function is used, the BY variables must be supplied as a list.
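A minimal sketch of aggregate() with a single BY variable, on a made-up data frame:

```r
# A small made-up dataset with a grouping column.
df <- data.frame(
  group = c("a", "a", "b", "b"),
  score = c(10, 20, 30, 50)
)

# The BY variable is wrapped in list(); naming it labels the output column.
group_means <- aggregate(df$score, by = list(group = df$group), FUN = mean)
print(group_means)
```

The result is a data frame with one row per group; the aggregated values land in a column named x.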

23. How many sorting algorithms are available?

Five sorting algorithms are commonly used:

  • Bubble Sort
  • Selection Sort
  • Merge Sort
  • Quick Sort
  • Bucket Sort
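To make the first entry concrete, here is a minimal sketch of bubble sort in R. The function name bubble_sort is hypothetical and written for illustration; in practice the built-in sort() is the usual choice.

```r
# Bubble sort: repeatedly swap adjacent elements that are out of order.
bubble_sort <- function(x) {
  n <- length(x)
  if (n < 2) return(x)
  for (i in seq_len(n - 1)) {
    swapped <- FALSE
    # After pass i, the last i elements are already in their final place.
    for (j in seq_len(n - i)) {
      if (x[j] > x[j + 1]) {
        tmp <- x[j]
        x[j] <- x[j + 1]
        x[j + 1] <- tmp
        swapped <- TRUE
      }
    }
    # A pass without swaps means the vector is sorted.
    if (!swapped) break
  }
  x
}

sorted <- bubble_sort(c(5, 2, 9, 1, 5, 6))
```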

24. How do you create a new variable in R?

A new variable is created with the assignment operator '<-', e.g. mydata$sum <- mydata$x1 + mydata$x2
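A runnable version of that pattern, assuming a made-up data frame mydata with columns x1 and x2:

```r
# A small made-up dataset.
mydata <- data.frame(x1 = c(1, 2, 3), x2 = c(10, 20, 30))

# Assigning to a column name that does not exist yet creates the new variable.
mydata$sum <- mydata$x1 + mydata$x2

# New variables can build on ones created earlier.
mydata$mean <- mydata$sum / 2
print(mydata)
```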

25. What are R packages?

Packages are collections of data, R functions, and compiled code in a well-defined format. The directory where packages are stored is called the library. One of the strengths of R is the large number of user-written functions available through packages.

26. What is the workspace in R?

The workspace is the current R working environment, which includes any user-defined objects such as vectors and lists.

27. Which function is used for merging data frames horizontally in R?

The merge() function is used to merge two data frames horizontally, joining them on one or more common key columns.

e.g. Sum <- merge(dataframe1, dataframe2, by = "ID")
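A minimal sketch with two made-up data frames that share an ID column (the names customers and orders are illustrative):

```r
# Two datasets that share the key column 'ID'.
customers <- data.frame(ID = c(1, 2, 3), name = c("Ann", "Bob", "Cy"))
orders    <- data.frame(ID = c(2, 3, 4), amount = c(50, 75, 20))

# By default merge() keeps only the IDs present in both data frames.
merged <- merge(customers, orders, by = "ID")
print(merged)
```

Setting all = TRUE keeps unmatched rows from both sides and fills the gaps with NA.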

28. Which function is used for merging data frames vertically in R?

The rbind() function is used to merge two data frames vertically.

e.g. Sum <- rbind(dataframe1, dataframe2)

29. What is power analysis?

Power analysis is used in experimental design. It determines the sample size required to detect an effect of a given size with a given degree of confidence.

30. Which package is used for power analysis in R?

The pwr package is used for power analysis in R.
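To avoid an extra dependency, the sketch below does not use pwr. It uses power.t.test() from the stats package that ships with R, which covers the two-sample t-test case. The effect size, standard deviation, significance level, and target power are made-up inputs.

```r
# Solve for the per-group sample size of a two-sample t-test.
# Exactly one of n, delta, sd, power, sig.level is left out to be computed.
result <- power.t.test(
  delta = 1,        # difference in means we want to detect
  sd = 1,           # assumed standard deviation
  sig.level = 0.05, # type I error rate
  power = 0.8       # desired probability of detecting the effect
)

# Round up, since a fraction of an observation cannot be collected.
n_per_group <- ceiling(result$n)
print(result)
```

Supplying n and leaving out power makes the same function compute the power achieved by a given sample size.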