Problem Set 9
Instructions
- Limited packages: Use only base R functions and the dslabs package
- No for-loops: Use vectorized operations and built-in functions (functions like apply are acceptable except where specifically noted)
- Create a 100×10 matrix of randomly generated normal numbers. Store the result in x.
set.seed(2025)
## your code here
- Apply the three R functions that return: (a) the dimensions of x, (b) the number of rows of x, and (c) the number of columns of x.
## your code here
- Add row-specific scalars to matrix x: add 1 to row 1, add 2 to row 2, and so on.
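As a small illustration of why a plain vector addition works row-wise, the sketch below uses a toy matrix m (a hypothetical stand-in, not the x from this problem set). R stores matrices column by column, so a vector whose length equals the number of rows is recycled down each column.

```r
# Toy 3x2 matrix of zeros (hypothetical data for illustration only)
m <- matrix(0, nrow = 3, ncol = 2)

# 1:3 has length nrow(m), so it is recycled down each column:
# element i of the vector is added to every entry in row i.
m_rows <- m + 1:3
m_rows
```

The same recycling rule is what makes column-wise operations need a different tool, since a vector of length ncol(m) would not line up with columns.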
## your code here
- Add column-specific scalars to matrix x: add 1 to column 1, add 2 to column 2, and so on. Hint: Use sweep with FUN = "+".
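The hint can be illustrated on a toy matrix. This is a minimal sketch; the names m, offsets, and m_cols are made up for the example and the values are arbitrary.

```r
# Toy 2x3 matrix of zeros (hypothetical data for illustration only)
m <- matrix(0, nrow = 2, ncol = 3)

# One offset per column
offsets <- c(10, 20, 30)

# MARGIN = 2 aligns STATS with columns: column j gets offsets[j] added
m_cols <- sweep(m, MARGIN = 2, STATS = offsets, FUN = "+")
m_cols
```

With MARGIN = 1 the same call would align the vector with rows instead, in which case STATS would need one value per row.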
## your code here
- Compute the mean of each row of x.
## your code here
- Compute the mean of each column of x.
## your code here
- Load the MNIST training data using dslabs. For each image, compute the proportion of pixels that are in the “grey area” (pixel values between 50 and 205, inclusive). Create a boxplot showing these proportions grouped by digit class (0-9). Hint: Use logical operators and rowMeans.
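The hinted combination of a logical comparison and rowMeans can be tried on a tiny hand-made matrix first. The sketch below assumes a toy matrix pix with one observation per row; it does not load MNIST.

```r
# Toy "images": 2 observations (rows) with 4 pixels each (hypothetical values)
pix <- matrix(c(0, 100, 255,  60,
                50, 205, 206, 100), nrow = 2, byrow = TRUE)

# Logical matrix: TRUE where a pixel lies in [50, 205]
in_range <- pix >= 50 & pix <= 205

# TRUE counts as 1 and FALSE as 0, so the row mean is the proportion of TRUEs
prop <- rowMeans(in_range)
prop
```

A vector like prop, paired with a vector of labels of the same length, is the kind of input that boxplot(prop ~ labels) expects.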
## your code here
- Use the solve function to find the solution to this system of linear equations:
\[ \begin{align} x + 2y - 2z &= -15\\ 2x + y - 5z &= -21\\ x - 4y + z &= 18 \end{align} \]
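To show the calling pattern of solve, here is a sketch on a different, smaller system chosen for illustration: 2a + b = 5 and a - b = 1. The names A, b, and sol are assumptions of the example.

```r
# Coefficient matrix: one row per equation, one column per unknown
A <- matrix(c(2,  1,
              1, -1), nrow = 2, byrow = TRUE)

# Right-hand side of the equations
b <- c(5, 1)

# solve(A, b) returns the vector s satisfying A %*% s = b
sol <- solve(A, b)
sol

# Sanity check: multiplying back should reproduce the right-hand side
A %*% sol
```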
## your code here
- Use matrix multiplication to compute the mean of each column of x and store the result as a single-row matrix. Hint: Create a 1×n matrix of weights (1/n, 1/n, …, 1/n) where n = nrow(x).
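The weight-vector idea in the hint can be checked by hand on a toy matrix. In this sketch, m, w, and col_means are hypothetical names and the data are chosen so the answer is easy to verify.

```r
# Toy 3x2 matrix: column 1 is 1,2,3 and column 2 is 4,5,6
m <- matrix(1:6, nrow = 3)
n <- nrow(m)

# 1 x n matrix with every weight equal to 1/n
w <- matrix(1 / n, nrow = 1, ncol = n)

# (1 x n) %*% (n x p) gives a 1 x p matrix of weighted column sums,
# which are the column means because the weights are all 1/n
col_means <- w %*% m
col_means
```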
## your code here
- Use matrix multiplication and other matrix operations to compute the standard deviation of each column of x. Do not use sweep, apply, or any built-in standard deviation functions. Hint: Recall that \(\text{sd} = \sqrt{\frac{1}{n-1}\sum(x_i - \bar{x})^2}\).
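The formula in the hint can be seen in action on a single vector before extending it to a matrix. This is only a sketch for one vector v (a made-up example); here sd() is used purely as a check on the hand computation.

```r
# Toy vector (hypothetical values)
v <- c(2, 4, 4, 4, 5, 5, 7, 9)
n <- length(v)

# Deviations from the mean
centered <- v - mean(v)

# crossprod(centered) is t(centered) %*% centered, i.e. the sum of squares;
# drop() turns the resulting 1x1 matrix into a plain number
sd_manual <- sqrt(drop(crossprod(centered)) / (n - 1))
sd_manual

# Verification only: compare with the built-in function
all.equal(sd_manual, sd(v))
```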
## your code here
- Load the MNIST training data and create a matrix small_x containing only the first 100 observations. Use matrix multiplication to compute the correlation matrix between all pairs of these 100 observations. Hint: First standardize the rows, then use %*% with the transpose.
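The standardize-then-multiply idea can be sketched on a small random matrix. Everything below (the matrix m, its size, and the variable names) is an assumption made for illustration, and cor() appears only to confirm the result.

```r
set.seed(1)
# Toy data: 3 observations (rows), 6 features (columns)
m <- matrix(rnorm(3 * 6), nrow = 3)
p <- ncol(m)

# Center each row; rowMeans(m) has length nrow(m), so it recycles by row
centered <- m - rowMeans(m)

# Row standard deviations, then scale each row to unit sd
row_sd <- sqrt(rowSums(centered^2) / (p - 1))
z <- centered / row_sd

# Entry (i, j) is the correlation between observation i and observation j
cors <- z %*% t(z) / (p - 1)

# Verification only: cor() works on columns, so transpose first
all.equal(cors, cor(t(m)))
```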
## your code here
- Using the matrix x from problem 1, create a new matrix where each entry is replaced by 1 if the original value is above the row mean, and 0 otherwise. Do this using only matrix operations and logical indexing (no loops or apply functions).
## your code here
- Compute the Euclidean distance between row 1 and row 50 of matrix x using matrix multiplication. Store the result in a variable called distance. Hint: Use crossprod() or matrix multiplication with transpose.
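As a quick illustration of the crossprod() hint, the sketch below computes the distance between two short made-up vectors a and b, chosen so the answer is a 3-4-5 triangle.

```r
# Two toy vectors (hypothetical values)
a <- c(1, 2, 3)
b <- c(4, 6, 3)

# Difference vector
d <- a - b

# crossprod(d) equals t(d) %*% d, the sum of squared differences;
# drop() converts the 1x1 matrix into a scalar before taking the square root
dist_ab <- sqrt(drop(crossprod(d)))
dist_ab
```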
## your code here
- Create a 5×5 identity matrix using only the matrix() function and diag(). Store it in a variable called I5.
## your code here
- Using the MNIST data, find which pixel position (row and column in the 28×28 image grid) has the highest average intensity across all training images. Return both the row and column indices. Hint: Use which.max() and arrayInd().
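The two hinted functions fit together as follows on a toy matrix. This sketch uses made-up values; note that arrayInd() interprets the linear index in column-major order, which is how R fills matrices.

```r
# Toy 2x3 matrix (hypothetical values); the largest entry, 9, is at row 1, column 2
m <- matrix(c(1, 9, 3,
              4, 2, 8), nrow = 2, byrow = TRUE)

# Linear (column-major) index of the largest entry
idx <- which.max(m)

# Convert the linear index into (row, column) for an array with these dimensions
pos <- arrayInd(idx, dim(m))
pos
```

For a vector of per-pixel averages, the second argument would be the dimensions of the image grid rather than dim() of an existing matrix.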
## your code here
- Using the MNIST training data, take the first 100 observations and reshape them into a 100×28×28 multidimensional array. Then compute the per-pixel averages across the first dimension to get a 28×28 matrix of average pixel intensities. Note that using apply is recommended here, since it is needed for multidimensional arrays. You can visualize this average image using image(1:28, 1:28, pixel_averages[, 28:1]) to see what the “average” digit looks like across these 100 observations.
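The reshape-then-apply pattern can be tried on a tiny case first. The sketch below assumes a toy matrix m with 2 observations of 4 values each, reshaped into a 2×2×2 array; the names and sizes are illustrative only.

```r
# Toy data: 2 observations (rows), each a flattened 2x2 "image"
m <- matrix(1:8, nrow = 2)

# Reshape so the first dimension still indexes observations
arr <- array(m, dim = c(2, 2, 2))

# Slice i of the array is observation i laid out as a 2x2 grid
arr[1, , ]

# Keep dimensions 2 and 3, average over dimension 1 (the observations)
avg <- apply(arr, c(2, 3), mean)
avg
```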
## your code here
- Select the first 20 observations labeled as “1” and the first 20 observations labeled as “0” from the MNIST training data. Compute three correlation matrices: (a) correlations within the 1’s group, (b) correlations within the 0’s group, and (c) correlations between the 1’s and 0’s groups. Compare the average within-group correlation to the average between-group correlation.
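One possible way to get between-group correlations is the two-argument form of cor(), which correlates the columns of its first argument with the columns of its second. The sketch below uses two small random groups, g1 and g2, which are hypothetical stand-ins rather than MNIST digits.

```r
set.seed(2)
# Toy groups: observations in rows, 5 features each
g1 <- matrix(rnorm(2 * 5), nrow = 2)   # 2 observations
g2 <- matrix(rnorm(3 * 5), nrow = 3)   # 3 observations

# cor() works column-wise, so transpose to put observations in columns.
# Entry (i, j) is the correlation between row i of g1 and row j of g2.
between <- cor(t(g1), t(g2))
dim(between)

# Within-group correlations for comparison (diagonal entries are all 1)
within1 <- cor(t(g1))
```

When averaging a within-group matrix, the diagonal of ones would inflate the mean, so it may be worth excluding it, for example with within1[upper.tri(within1)].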
## your code here