Date: Thu, 30 Oct 1997 21:32:45 +0100
From: Swietanowski Artur

> Just a short question: What is, in your opinion, the proportion of calls
> to MATMUL in which the sequence association is preserved, relative to all
> calls in a typical code? I don't mean some examples concocted to confuse
> compilers and possibly compiler writers, but real applications.

I have no idea. I'm actually a debugger person, not a compiler person, but I do work with our HPF group. Most of the codes I have seen are actually legacy/F77 based, and as such use sequence association because F77 doesn't provide an alternative. If the question is more than rhetorical, many other people will have to contribute their answers.

> And three comments:
> 2) If a compiler came with decent documentation of what optimizations
> are possible for different input arguments, many application writers
> would try to avoid the low-performance constructs. One possible form of
> such documentation would be a compiler option that would cause the
> compiler to list the optimizations it cannot do and explain why.

This might be desirable from a user's perspective, but it is a bad idea for maintainability and portability. Instead of expressing the intent of your algorithm, you would be conforming to the peculiarities of a particular implementation. As somebody on this list said a few days ago, you could write a wrapper to MATMUL to do your own blocking, and all the other things you would want the compiler to do, but the result would no longer look like the original loop doing MATMUL()s.

> 3) Some performance-conscious computer vendors provide custom versions
> of BLAS, which would take care of efficiently executing the MATMUL when
> the input data is sufficiently regular. If you'd rely on BLAS to do the
> dirty work, you could save yourself the effort of further optimizing the
> 'special cases'.

You lost me here. MATMUL is an F90 intrinsic. BLAS is a linear algebra library (which may have its own MATMUL).
If you are saying that, when the inputs are not sequence associated but are regular, one should describe them in terms of BLAS objects, use BLAS to redistribute/pack the inputs, and then run MATMUL, then I suspect the compiler has enough knowledge of the inputs to do the redistribution/pack itself, without resorting to BLAS. If you are saying something else, I'm not sure what it is.
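
To make the "pack, then multiply" idea concrete, here is a minimal sketch in Python rather than Fortran. It assumes a flat column-major buffer (as Fortran stores arrays) and a regular section described by an offset and two strides. The names pack_section and matmul_contiguous are hypothetical, invented for illustration; they are not any compiler's or BLAS library's actual interface. The section used corresponds roughly to A(1:3:2, :) of a 4 x 2 array A.

```python
# Hypothetical sketch: function names and calling conventions are
# illustrative assumptions, not a real compiler or BLAS interface.

def pack_section(storage, offset, nrows, ncols, row_stride, col_stride):
    """Copy a regularly strided 2-D section of a flat buffer into a new
    contiguous column-major list."""
    packed = []
    for j in range(ncols):          # columns outermost: column-major order
        for i in range(nrows):
            packed.append(storage[offset + i * row_stride + j * col_stride])
    return packed


def matmul_contiguous(a, b, m, k, n):
    """Multiply contiguous column-major a (m x k) by b (k x n) and return
    the m x n product, also contiguous and column-major."""
    c = [0] * (m * n)
    for j in range(n):
        for p in range(k):
            bpj = b[p + j * k]
            for i in range(m):
                c[i + j * m] += a[i + p * m] * bpj
    return c


# A 4 x 2 array stored column-major. Taking every other row (rows 0 and 2)
# gives a section that is regular but not contiguous in storage.
full = [1, 2, 3, 4,    # column 0
        5, 6, 7, 8]    # column 1
a = pack_section(full, offset=0, nrows=2, ncols=2, row_stride=2, col_stride=4)
identity = [1, 0, 0, 1]
c = matmul_contiguous(a, identity, 2, 2, 2)
```

The packing step needs only the offset, shape, and strides of the section, which is the information a compiler already has when it sees an array section passed to MATMUL.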