MBZUAI Entry Exam Sample Questions

Question: Given vectors \( u = (1, 2, -1) \) and \( v = (0, 1, 3) \), compute the projection of \( u \) onto \( v \).

Question 3: True or false: if \( A \) and \( B \) are square invertible matrices, then \( (A + B)^{-1} = A^{-1} + B^{-1} \). Explain briefly.
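A quick numerical sanity check for Question 3 (a sketch, not part of the exam; the two invertible matrices below are arbitrary choices):

    import numpy as np

    # Test the claim (A + B)^{-1} = A^{-1} + B^{-1} on concrete matrices.
    A = np.array([[2.0, 0.0], [0.0, 1.0]])
    B = np.array([[1.0, 1.0], [0.0, 1.0]])

    lhs = np.linalg.inv(A + B)                  # (A + B)^{-1}
    rhs = np.linalg.inv(A) + np.linalg.inv(B)   # A^{-1} + B^{-1}

    print(np.allclose(lhs, rhs))                # False: the identity fails here

One counterexample settles a true-or-false claim, so a check like this is a useful habit before writing the explanation.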
Section 2: Calculus & Optimization (25%)

Question 4: Find the gradient \( \nabla f(x,y) \) of \( f(x,y) = \ln(1 + e^{xy}) \). Then compute the directional derivative at \( (1,0) \) in the direction of \( (1,1) \).
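One way to verify a hand-derived answer to Question 4, using central finite differences (a sketch; the step size h is an arbitrary choice):

    import numpy as np

    # f(x, y) = ln(1 + e^{xy}); approximate its gradient at (1, 0).
    f = lambda x, y: np.log(1 + np.exp(x * y))
    h = 1e-6
    x0, y0 = 1.0, 0.0

    grad = np.array([
        (f(x0 + h, y0) - f(x0 - h, y0)) / (2 * h),  # df/dx at (1, 0)
        (f(x0, y0 + h) - f(x0, y0 - h)) / (2 * h),  # df/dy at (1, 0)
    ])

    u = np.array([1.0, 1.0])
    u = u / np.linalg.norm(u)   # unit vector along (1, 1)
    print(grad)                 # approximately [0.0, 0.5]
    print(grad @ u)             # directional derivative, approximately 0.354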
Question 5: Consider \( f(x) = x^3 - 2x + 1 \). Perform one iteration of gradient descent starting at \( x_0 = 1 \) with learning rate \( \eta = 0.1 \). What is \( x_1 \)?
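A minimal sketch for checking the arithmetic in Question 5, assuming the standard update rule \( x_1 = x_0 - \eta f'(x_0) \):

    # f(x) = x^3 - 2x + 1, so f'(x) = 3x^2 - 2.
    eta = 0.1
    x0 = 1.0
    grad = 3 * x0**2 - 2   # f'(1) = 1
    x1 = x0 - eta * grad
    print(x1)              # 0.9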
Question 6: For \( f(x) = \frac{1}{2} \|Ax - b\|^2 \), derive the closed-form solution for \( x \) that minimizes \( f \).
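A numerical sanity check for Question 6 (a sketch assuming \( A \) has full column rank, so \( A^\top A \) is invertible; the random data is illustrative only):

    import numpy as np

    # Compare the normal-equations solution with numpy's least-squares solver.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 3))
    b = rng.standard_normal(5)

    x_closed = np.linalg.solve(A.T @ A, A.T @ b)     # (A^T A)^{-1} A^T b
    x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

    print(np.allclose(x_closed, x_lstsq))            # True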
Section 3: Probability & Statistics (20%)

Question 7: You have two dice: one fair, one loaded (it shows 6 with probability \( 1/3 \) and each other face with probability \( 2/15 \)). You pick a die at random (50% chance each), roll it once, and get a 6. What is the probability it was the loaded die?
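A direct Bayes computation for Question 7 (a sketch of one approach, using exact fractions):

    from fractions import Fraction

    p_loaded = Fraction(1, 2)    # prior probability of picking the loaded die
    p6_loaded = Fraction(1, 3)   # P(roll a 6 | loaded)
    p6_fair = Fraction(1, 6)     # P(roll a 6 | fair)

    # Bayes' rule: P(loaded | 6) = P(6 | loaded) P(loaded) / P(6).
    posterior = (p_loaded * p6_loaded) / (
        p_loaded * p6_loaded + (1 - p_loaded) * p6_fair
    )
    print(posterior)             # 2/3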
Question 8: Let \( X \sim \mathcal{N}(0,1) \). Compute \( \mathbb{E}[e^X] \). (Hint: use the moment-generating function of the normal distribution.)

Question 9: What is the bias-variance decomposition of expected test error? Write the formula and briefly explain each term.
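A Monte Carlo illustration of the decomposition in Question 9 (a sketch under toy assumptions: a deliberately biased, shrunken-mean estimator of a Gaussian mean, with arbitrary constants):

    import numpy as np

    rng = np.random.default_rng(0)
    mu, sigma, n, trials = 2.0, 1.0, 10, 200_000

    # Each trial: estimate mu from n noisy samples with a biased estimator.
    samples = rng.normal(mu, sigma, size=(trials, n))
    mu_hat = 0.5 * samples.mean(axis=1)

    bias_sq = (mu_hat.mean() - mu) ** 2            # squared bias of the estimator
    variance = mu_hat.var()                        # variance of the estimator
    noise = sigma ** 2                             # irreducible error

    # Expected squared error on fresh test targets y ~ N(mu, sigma^2).
    y_new = rng.normal(mu, sigma, size=trials)
    test_err = ((y_new - mu_hat) ** 2).mean()

    print(test_err, bias_sq + variance + noise)    # both approximately 2.025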
Section 4: Machine Learning Basics (15%)

Question 10: Why does L2 regularization produce smaller but non-zero weights, while L1 regularization can produce exact zeros?

Question 12: Briefly explain how backpropagation computes gradients in a neural network. Why is the chain rule essential?
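A tiny worked example of the chain-rule structure behind Question 12 (a sketch, not the expected written answer; the one-unit, two-layer network and its constants are arbitrary):

    import numpy as np

    # Network: y_hat = w2 * tanh(w1 * x), loss = (y_hat - y)^2.
    x, y, w1, w2 = 0.5, 1.0, 0.3, 0.7

    # Forward pass, caching intermediates.
    z = w1 * x
    a = np.tanh(z)
    y_hat = w2 * a

    # Backward pass: multiply local derivatives, one chain-rule factor each.
    dloss_dyhat = 2 * (y_hat - y)
    dloss_dw2 = dloss_dyhat * a                  # dL/dw2
    dloss_da = dloss_dyhat * w2
    dloss_dz = dloss_da * (1 - a ** 2)           # tanh'(z) = 1 - tanh(z)^2
    dloss_dw1 = dloss_dz * x                     # dL/dw1

    print(dloss_dw1, dloss_dw2)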
Section 5: Python & Coding Logic (10%)

Question 13: What is the output of the following?

    def func(x, lst=[]):
        lst.append(x)
        return lst

    print(func(1))
    print(func(2))

Question 14: Write a Python function normalize(X) that takes a 2D numpy array X (samples × features) and returns a zero-mean, unit-variance normalized version (feature-wise). Do not use sklearn.
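One possible solution sketch for Question 14 (the guard against zero-variance columns is an added assumption, not something the question specifies):

    import numpy as np

    def normalize(X):
        # Feature-wise standardization: zero mean, unit variance per column.
        X = np.asarray(X, dtype=float)
        mean = X.mean(axis=0)
        std = X.std(axis=0)
        std[std == 0] = 1.0          # leave constant features at zero
        return (X - mean) / std

    X = np.array([[1.0, 2.0], [3.0, 2.0], [5.0, 2.0]])
    print(normalize(X))              # first column standardized; second all zeros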