2. Show that:
$$\mathrm{ReLU}\!\left[\beta_1 + \lambda_1 \cdot \Omega_1\,\mathrm{ReLU}\!\left[\beta_0 + \lambda_0 \cdot \Omega_0 x\right]\right] = \lambda_0\lambda_1 \cdot \mathrm{ReLU}\!\left[\frac{1}{\lambda_0\lambda_1}\beta_1 + \Omega_1\,\mathrm{ReLU}\!\left[\frac{1}{\lambda_0}\beta_0 + \Omega_0 x\right]\right],$$
where $\lambda_0$ and $\lambda_1$ are non-negative scalars, $\Omega_k$ is the weight matrix applied to the $k$-th layer (contributing to the $(k+1)$-th layer), and $\beta_k$ is the vector of biases that contribute to hidden layer $k+1$. From this, we see that the weight matrices can be rescaled by any magnitude as long as the biases are adjusted accordingly, and the scale factors can be re-applied at the end of the network.
Hint: Use the non-negative homogeneity property of the ReLU function, i.e., $\mathrm{ReLU}[\lambda z] = \lambda \cdot \mathrm{ReLU}[z]$ for any $\lambda \ge 0$.
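A sketch of the requested derivation, using the hint. Assuming $\lambda_0, \lambda_1 > 0$ so the divisions are defined, the scale factors can be pulled out of the right-hand side one layer at a time:

$$\begin{aligned}
\lambda_0\lambda_1 \cdot \mathrm{ReLU}\!\left[\frac{1}{\lambda_0\lambda_1}\beta_1 + \Omega_1\,\mathrm{ReLU}\!\left[\frac{1}{\lambda_0}\beta_0 + \Omega_0 x\right]\right]
&= \lambda_0\lambda_1 \cdot \mathrm{ReLU}\!\left[\frac{1}{\lambda_0\lambda_1}\beta_1 + \frac{1}{\lambda_0}\,\Omega_1\,\mathrm{ReLU}\!\left[\beta_0 + \lambda_0 \Omega_0 x\right]\right] \\
&= \lambda_0\lambda_1 \cdot \frac{1}{\lambda_0\lambda_1}\,\mathrm{ReLU}\!\left[\beta_1 + \lambda_1 \Omega_1\,\mathrm{ReLU}\!\left[\beta_0 + \lambda_0 \Omega_0 x\right]\right] \\
&= \mathrm{ReLU}\!\left[\beta_1 + \lambda_1 \cdot \Omega_1\,\mathrm{ReLU}\!\left[\beta_0 + \lambda_0 \cdot \Omega_0 x\right]\right],
\end{aligned}$$

where the first step factors $\tfrac{1}{\lambda_0}$ out of the inner ReLU's argument and applies homogeneity, and the second step does the same with $\tfrac{1}{\lambda_0\lambda_1}$ on the outer ReLU.

As a quick numerical sanity check, here is a minimal NumPy sketch of the identity; the layer sizes, random seed, and scale factors are arbitrary illustrative choices, not part of the original problem:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

rng = np.random.default_rng(0)
x = rng.standard_normal(4)                # input vector
Omega0 = rng.standard_normal((5, 4))      # weights into hidden layer 1
Omega1 = rng.standard_normal((3, 5))      # weights into hidden layer 2
beta0, beta1 = rng.standard_normal(5), rng.standard_normal(3)
lam0, lam1 = 0.7, 2.5                     # arbitrary positive scale factors

# Left-hand side: weights scaled by lambda, biases unchanged.
lhs = relu(beta1 + lam1 * Omega1 @ relu(beta0 + lam0 * Omega0 @ x))

# Right-hand side: weights unchanged, biases rescaled,
# scale factors re-applied at the end of the network.
rhs = lam0 * lam1 * relu(beta1 / (lam0 * lam1)
                         + Omega1 @ relu(beta0 / lam0 + Omega0 @ x))

assert np.allclose(lhs, rhs)
```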