Hello,
The symptoms you describe suggest that, after the first run, the results are returned from an internal cache. To me, that explains the difference between 45 seconds on the first run and 2 seconds on the second.
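If you want to check the cache theory, one rough way is to time the same query several times in a row and compare the first run with the later ones. The sketch below is only an illustration: `run_query` is a hypothetical stand-in (not a BPC or OLAP BAPI call) that simulates a slow first execution and a cached repeat, and the query text is a placeholder. You would swap in whatever call actually executes your MDX.

```python
import time
from functools import lru_cache


def time_calls(fn, arg, repeats=3):
    """Return the elapsed seconds for each of `repeats` consecutive calls to fn(arg)."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        timings.append(time.perf_counter() - start)
    return timings


@lru_cache(maxsize=None)
def run_query(mdx):
    # Hypothetical stand-in for the real query call (assumption, not a real API).
    # The sleep simulates a cold execution; lru_cache simulates an internal cache.
    time.sleep(0.2)
    return f"result of {mdx}"


timings = time_calls(run_query, "placeholder MDX statement")
for run_number, seconds in enumerate(timings, start=1):
    print(f"run {run_number}: {seconds:.3f} s")
```

If the first run is much slower than the rest, as with your 45 seconds versus 2 seconds, that is consistent with caching, although it does not tell you which layer holds the cache.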
As for the second question, I think that if there is a way to make MDX queries run in parallel, it would have to be at a lower level than BPC. For example, it might be possible at the OLAP BAPI execution level. That said, I doubt that a single MDX query can be executed in parallel.
Best Regards,
Leila