I am trying to use Altera's floating-point IP to generate half-precision (16-bit) blocks instead of single-precision (32-bit) blocks for addition, multiplication, etc. However, when configuring the IP, it seems that half-precision FP needs a lot more LUTs and has far more latency in cycles than its 32-bit counterpart. This seems contrary to the assumption that half-precision FP should be faster than single precision and occupy less circuit area. Has anybody used Altera's half-precision FP IP blocks? Did you see any improvement compared to single precision?