Intel OpenCL Implicit Vectorizer
OpenCL compilers differ from vendor to vendor. Intel's compiler includes an implicit vectorizer that automatically vectorizes some kernel code so it can take advantage of SSE and AVX instructions.
For example, running the Black-Scholes equation in single-threaded C99 and in single-threaded OpenCL gives the execution times below:
- Input: 10 MB of data
- Calculates both the call and the put option
- Both use the -O3 compiler option of gcc-4.4
C99: 1612.203 ms
OpenCL: 673.248 ms
That makes the OpenCL version roughly 2.4x faster without any change to the algorithm. This 'hidden' optimization is kinda cool, isn't it?
Written by yuriardila