Problem set 2
For these exercises, do not load any packages other than dslabs.
Make sure to use vectorization whenever possible.
- What is the sum of the first 100 positive integers? Use the functions seq and sum to compute the sum with R for any n.
# Your code here
- Load the US murders dataset from the dslabs package. Use the function str to examine the structure of the murders object. What are the column names used by the data frame for these five variables? Show the subset of murders containing the states with fewer than 1 death per 100,000 people. Show all variables.
library(dslabs)
str(murders)
'data.frame': 51 obs. of 5 variables:
$ state : chr "Alabama" "Alaska" "Arizona" "Arkansas" ...
$ abb : chr "AL" "AK" "AZ" "AR" ...
$ region : Factor w/ 4 levels "Northeast","South",..: 2 4 4 2 4 4 1 2 2 2 ...
$ population: num 4779736 710231 6392017 2915918 37253956 ...
$ total : num 135 19 232 93 1257 ...
# Your code here
- Show the subset of murders containing the states with fewer than 1 death per 100,000 people that are in the West of the US. Don’t show the region variable.
# Your code here
- Show the largest state with a rate less than 1 per 100,000.
# Your code here
- Show the state with a population of more than 10 million with the lowest rate.
# Your code here
- Compute the rate for each region of the US.
# Your code here
- Create a vector of numbers that starts at 6, does not pass 55, and adds numbers in increments of 4/7: 6, 6 + 4/7, 6 + 8/7, and so on. How many numbers does the vector have? Hint: use seq and length.
# Your code here
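As a small illustration of the hint, using unrelated numbers rather than the exercise's values: seq with a by argument stops at the last value that does not exceed the upper limit, and length counts the elements. The names steps and n_steps are made up for this sketch.

```r
# Toy example: start at 1, step by 4, do not pass 10
steps <- seq(1, 10, by = 4)   # 10 itself is not reached; the sequence stops before it
n_steps <- length(steps)      # number of elements in the sequence
print(steps)
print(n_steps)
```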
- Make this data frame:
temp <- c(35, 88, 42, 84, 81, 30)
city <- c("Beijing", "Lagos", "Paris", "Rio de Janeiro",
          "San Juan", "Toronto")
city_temps <- data.frame(name = city, temperature = temp)
Convert the temperatures to Celsius.
# Your code here
- Write a function euler that computes the following sum for any \(n\):
\[ S_n = 1 + 1/2^2 + 1/3^2 + \dots + 1/n^2 \]
# Your code here
- Show that as \(n\) gets bigger, \(S_n\) gets closer to \(\pi^2/6\) by plotting \(S_n\) versus \(n\) with a horizontal dashed line at \(\pi^2/6\).
# Your code here
- Use the %in% operator and the predefined object state.abb to create a logical vector that answers the question: which of the following are actual abbreviations: MA, ME, MI, MO, MU?
# Your code here
- Extend the code you used in the previous exercise to report the one entry that is not an actual abbreviation. Hint: use the ! operator, which turns FALSE into TRUE and vice versa, then which to obtain an index.
# Your code here
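The hint can be seen on a toy logical vector that has nothing to do with state abbreviations. This is only a sketch; is_even, not_even, and idx are hypothetical names.

```r
# Toy logical vector (hypothetical data, unrelated to the exercise)
is_even <- c(2, 4, 7, 10) %% 2 == 0   # TRUE TRUE FALSE TRUE
not_even <- !is_even                  # negation flips every entry
idx <- which(not_even)                # positions where not_even is TRUE
print(idx)
```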
- In the murders dataset, use %in% to show all variables for New York, California, and Texas, in that order.
# Your code here
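One thing to watch for: subsetting with %in% keeps rows in the data frame's own order, not the order of the values you asked for. The sketch below shows this on a made-up data frame (fruit and wanted are hypothetical). The match function is not mentioned in the exercise and is shown only as one possible way to recover a requested order.

```r
# Hypothetical toy data frame, unrelated to murders
fruit <- data.frame(name = c("apple", "banana", "cherry", "date"),
                    price = c(3, 1, 4, 2),
                    stringsAsFactors = FALSE)
wanted <- c("date", "apple")

# %in% gives a logical vector aligned with fruit$name,
# so the rows come back in the data frame's order: apple, then date
by_in <- fruit[fruit$name %in% wanted, ]

# match() gives the row positions in the order of `wanted`: date, then apple
by_match <- fruit[match(wanted, fruit$name), ]

print(by_in)
print(by_match)
```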
- Write a function called vandermonde_helper that, for any \(x\) and \(n\), returns the vector \((1, x, x^2, x^3, \dots, x^n)\). Show the results for \(x=3\) and \(n=5\).
# Your code here
- Create a vector using:
n <- 10000
p <- 0.5
set.seed(2024-9-6)
x <- sample(c(0,1), n, prob = c(1 - p, p), replace = TRUE)
Compute the length of each stretch of 1s and then plot the distribution of these values. Check to see if the distribution follows a geometric distribution as the theory predicts. Do not use a loop!
# Your code here
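For reference, here is a sketch of what the theory predicts, assuming the draws are independent with probability \(p\) of a 1 and ignoring edge effects at the two ends of the vector. Once a stretch of 1s has started, it has length \(k\) exactly when the next \(k-1\) draws are 1 and the draw after that is 0, so the stretch length \(L\) (a symbol introduced here, not in the exercise) satisfies:

\[ \Pr(L = k) = p^{k-1}(1-p), \quad k = 1, 2, \dots \]

With \(p = 0.5\) this is \(0.5^k\). Note that R's dgeom counts the number of failures before the first success, starting at 0, so under these assumptions the matching call is dgeom(k - 1, prob = 1 - p).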