Yes, your reasoning is correct in theory, but this behavior is intentional and very useful. It happens because numpy applies a mechanism called broadcasting. When you perform an operation between arrays of different shapes, the smaller one is conceptually 'replicated' as many times as needed to match the larger one.
See the definition given in the numpy documentation:
The term broadcasting describes how numpy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is "broadcast" across the larger array so that they have compatible shapes. Broadcasting provides a means of vectorizing array operations so that looping occurs in C instead of Python. It does this without making needless copies of data and usually leads to efficient algorithm implementations.
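Thus, if you add a scalar to a vector or matrix, the scalar is added to each element of the vector or matrix. If you add a 1-D vector to a matrix, numpy aligns the trailing dimensions, so a vector of length n is added to each row of an (m, n) matrix. To add a vector to each column instead, it needs shape (m, 1). If the shapes are not compatible, numpy raises a ValueError.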
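Here is a minimal sketch of these cases, assuming numpy is installed. The array names and values are made up for illustration. Note that a 1-D vector whose length matches the number of columns is added to every row, while a vector reshaped to (rows, 1) is added to every column.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])            # shape (2, 3)
row_vector = np.array([10, 20, 30])       # shape (3,)
col_vector = np.array([[100], [200]])     # shape (2, 1)

# Scalar: added to every element.
plus_scalar = matrix + 1

# 1-D vector of length 3: trailing dimensions match, so it is added to each row.
plus_row = matrix + row_vector

# Shape (2, 1): stretched along the columns, so it is added to each column.
plus_col = matrix + col_vector

# Shapes (2, 3) and (2,) are incompatible: the trailing dimensions are 3 and 2.
try:
    matrix + np.array([1, 2])
    incompatible_failed = False
except ValueError:
    incompatible_failed = True

print(plus_scalar)
print(plus_row)
print(plus_col)
print(incompatible_failed)
```

In the last case, reshaping the length-2 vector with `np.array([1, 2]).reshape(2, 1)` would make the shapes compatible.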
This link has a clearer graphical explanation, if you still have doubts.