stdlib Define operators for transpose and matrix multiplication?

The Matlab equivalent of matmul(transpose(A),A) is A'*A, which is closer to math notation. Should stdlib define operators for transpose and matmul so that one can write something like (.t. A) .x. A? A simple implementation for default reals is below. Please excuse me if this has been proposed before.

module op_mod
implicit none
interface operator(.t.)
   module procedure my_trans
end interface
interface operator(.x.)
   module procedure my_mult
end interface
contains
pure function my_trans(x) result(xt)
real, intent(in) :: x(:,:)
real             :: xt(size(x,2),size(x,1))
xt = transpose(x)
end function my_trans
! 
pure function my_mult(x,y) result(xy)
real, intent(in) :: x(:,:),y(:,:)
real             :: xy(size(x,1),size(y,2))
xy = matmul(x,y)
end function my_mult
end module op_mod
!
program test_op
use op_mod
implicit none
integer, parameter :: m = 5, n = 7, p = 100
real :: a(m,n)
real, allocatable :: a1(:,:), a2(:,:)
call random_number(a)
a1 = matmul(transpose(a),a)
a2 = (.t.a) .x. a
print*,shape(a1),shape(a2)
print*,sum(a1**2),sum((a1-a2)**2),maxval(a1-a2)
end program test_op

with output

           7           7           7           7
   85.2432175       3.55271368E-15   0.00000000

Jan 10 '22 21:01 Beliavsky

I like the idea of something like this, but having implemented something nearly identical in my code before, I tend to get a significant performance hit with the interface operators for stuff like matrix multiplication.

Apr 07 '22 18:04 trippalamb

@trippalamb you get a performance hit compared to what? Is it using matmul(transpose(A),A) or something else? I don't believe the compiler gets smart enough to optimize it directly, at least in my humble tests. I don't see an issue with having a few syntactic sugars like this if the performance hit is just comparable to using the default functions but shorter (In heavy lifting codes you should be using BLAS anyway).

What could be done is implement an entire derived type for matrices, having a few fields (like: is_transposed, is_conjugated), and when the DT is doing an operation like a multiplication will check such fields and call an optimized subroutine. There will be a small overhead compared to what is the best but the syntax will be (arguably) friendly and shorter.

But this can get really tiresome with current Fortran capabilities, for instance, accessing data would be a little annoying since we can't overload (:,:) [I don't know the implications], we could do some atrocities like implementing an overload to operator(//) where rhs is the DT and the lhs is a "slice" :wink:, like this M//[1, 2], and some stuff to help to define ranges... Oh well! Anyway, I should stop here.

May 05 '22 14:05 14NGiestas