we can think intuitively of this as moving the error backward through the network, giving us some sort of measure of the error at the output of the $l^{\rm th}$ layer. We then take the Hadamard product $\odot \sigma'(z^l)$. This moves the error backward through the activation function in layer $l$, giving us the error $\delta^l$ in the weighted input to layer $l$.
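To make the two-step picture concrete, here is a minimal numpy sketch of this backward step, $\delta^l = ((w^{l+1})^T \delta^{l+1}) \odot \sigma'(z^l)$. The names `w_next`, `delta_next`, and `z` (standing for $w^{l+1}$, $\delta^{l+1}$, and $z^l$) are illustrative choices, not identifiers from the text:

```python
import numpy as np

def sigmoid_prime(z):
    """Derivative of the sigmoid, sigma'(z)."""
    s = 1.0 / (1.0 + np.exp(-z))
    return s * (1.0 - s)

def backprop_error(w_next, delta_next, z):
    """Propagate the error one layer back:
    delta^l = ((w^{l+1})^T delta^{l+1}) ⊙ sigma'(z^l)."""
    # Step 1: apply the transpose weight matrix, moving the error
    # backward through the network to the output of layer l.
    moved_back = np.dot(w_next.transpose(), delta_next)
    # Step 2: the elementwise (Hadamard) product with sigma'(z^l)
    # moves the error backward through the activation function,
    # giving the error in the weighted input to layer l.
    return moved_back * sigmoid_prime(z)

# Example shapes: layer l has 3 neurons, layer l+1 has 2.
w_next = np.random.randn(2, 3)      # w^{l+1}: (neurons in l+1) x (neurons in l)
delta_next = np.random.randn(2, 1)  # delta^{l+1}
z = np.random.randn(3, 1)           # weighted input z^l
delta = backprop_error(w_next, delta_next, z)  # delta^l, shape (3, 1)
```

Note that the transpose is what makes the shapes work out: $w^{l+1}$ maps activations in layer $l$ forward to layer $l+1$, so $(w^{l+1})^T$ maps the error in layer $l+1$ back to layer $l$.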