As it turns out, the Intel manual is correct in stating that you should use xchg instead of bswap for 16-bit registers. In practice, it's hard to call the result of a 16-bit bswap 'undefined', since it does the same thing every time. Instead of swapping the contents of the 8-bit registers ah and al within ax, it just clears ax to 0x0000. I tested this with a lot of different values, and it always zeroes the register. I also tried fully loading eax: bswap'ing ax with any value cleared only the ax part and left the upper half of eax intact.
So in practice: bswap reg16 = xor reg16, reg16 = mov reg16, 0 (where both operands of the xor are the same register)
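To make the difference concrete, here is a minimal C sketch that models the behaviour I observed next to what xchg al, ah does. It does not execute bswap itself; the function names observed_bswap_ax and xchg_al_ah are made up for illustration, and the first one only reflects what I saw on my machine, since the result is officially undefined and other CPUs may behave differently.

```c
#include <stdint.h>

/* Hypothetical model of the observed `bswap ax` result: the low 16 bits
   (ax) are cleared and the upper 16 bits of eax are left untouched.
   This mirrors one machine's behaviour, not a documented guarantee. */
static uint32_t observed_bswap_ax(uint32_t eax) {
    return eax & 0xFFFF0000u;
}

/* Model of `xchg al, ah`: swap the two low bytes of eax and keep the
   upper 16 bits as they are. This is the documented way to byte-swap
   a 16-bit register. */
static uint32_t xchg_al_ah(uint32_t eax) {
    uint32_t al = eax & 0xFFu;
    uint32_t ah = (eax >> 8) & 0xFFu;
    return (eax & 0xFFFF0000u) | (al << 8) | ah;
}
```

With eax = 0x12345678, the first function gives 0x12340000 (ax wiped, upper half kept), while the second gives 0x12347856, which is the byte swap you actually wanted.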