[mvapich-discuss] problem w/MVAPICH in the frames of Gen1

Mikhail Kuzminsky kus at free.net
Fri Aug 11 14:21:10 EDT 2006


In message from Sayantan Sur <surs at cse.ohio-state.edu> (Thu, 10 Aug 
2006 14:05:41 -0500):
<skipped>
>Thanks for verifying all the details and posting the strace output. 
>Yes, it seems that the problem stems from something else with system 
>and VAPI libraries than just MVAPICH.
>Observing this run for `perf_main' with strace,
>> strace on server part (strace perf_main -trc ...)           
>> -------------------------------------------------------------
>[...]
>> ioctl(6, 0x80287801, 0x7fbfffe800)      = 0
>> mlock(0x627000, 2150135809)             = -1 EPERM (Operation not 
>> permitted)
>> write(1, "Error: Allocating PD : Invalid V"..., 47Error: Allocating 
>>PD 
>> : Invalid Virtual Address) = 47
>> --- SIGSEGV (Segmentation fault) @ 0 (0) ---
>> +++ killed by SIGSEGV +++
>I am particularly disturbed noticing that the mlock() call requires 
>locking 2150135809 bytes of memory. That's almost 2GB! Going back a 
>little in this thread, it seems that you have set the max lockable 
>memory to 1.95GB with ssh and only 0.8GB with rsh. Of course, 
>the mlock() will fail :-)
     Dear Sayantan,

1) About memory amounts: 2150135809 is itself larger than 2 GB
(2^31 = 2147483648), but I don't know what number should really be
passed to the mlock() syscall; see below.

2) The 1.95 GB you see is just the MemTotal value (2046260 kB) from
my /proc/meminfo.
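
For reference, the figures in points 1) and 2) can be checked with a few lines of C. This is only a sketch: the helper names are made up for illustration, and it assumes MemTotal is given in units of 1024 bytes, as /proc/meminfo reports it on Linux. It also prints the request in hex, which comes out as 0x80287801, the same value as the ioctl() request code on the preceding strace line; that may hint that the length is a stray value rather than a real size, but this is only a guess.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define TWO_GIB (UINT64_C(1) << 31)

/* Hypothetical helper: bytes by which `bytes` exceeds `limit`
 * (negative if it is below the limit). */
static int64_t excess_over(uint64_t bytes, uint64_t limit)
{
    return (int64_t)bytes - (int64_t)limit;
}

/* Hypothetical helper: convert a /proc/meminfo value in kB
 * (assumed to be 1024-byte units) to bytes. */
static uint64_t meminfo_kb_to_bytes(uint64_t kb)
{
    return kb * UINT64_C(1024);
}

#ifdef SIZE_CHECK_MAIN   /* build with -DSIZE_CHECK_MAIN for a standalone program */
int main(void)
{
    const uint64_t request   = UINT64_C(2150135809);         /* length from the strace line */
    const uint64_t mem_total = meminfo_kb_to_bytes(2046260); /* MemTotal from point 2) */

    printf("request    = %" PRIu64 " bytes (0x%" PRIx64 ")\n", request, request);
    printf("over 2 GiB = %" PRId64 " bytes\n", excess_over(request, TWO_GIB));
    printf("MemTotal   = %" PRIu64 " bytes\n", mem_total);
    printf("over RAM   = %" PRId64 " bytes\n", excess_over(request, mem_total));
    return 0;
}
#endif
```

With these inputs the request is 2652161 bytes above 2 GiB and 54765569 bytes above MemTotal, so it is larger than the physical memory of the machine.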

3) My perf_main test run was:
(server) perf_main -send -trc ..., w/2-GB memlock limit
(client) perf_main -a10.0.0.1  w/800+ MB limit for user

But this run doesn't change the "whole picture": if I run the client-side
perf_main as root, I get the same mlock() problem on the server side
running as guest with the same 2-GB limit as root. In this situation
perf_main ought to work, because the client side under root works (and
really has no mlock() problem), yet the server side under guest still
hits the mlock() problem.

4) It looks to me that the problem is not in the max amount of
lockable memory: if I restrict the max lockable memory for root to
800+ MB (as it was originally for guest) and run mpirun_rsh, mvapich
works OK. So under root, with the same limits as for guest, mvapich works.
  
5) I think that Andrey is right that mlock() doesn't work (in my
particular kernel) for a non-privileged user.

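I ran the following very simple program:
#include <stdio.h>
#include <sys/mman.h>   /* mlock() */

int
main( int argc,
       char *argv[] )
{
   int vva[10];
   int rc, sz;

   sz = 10;               /* length handed to mlock(), in bytes */
   rc = mlock(vva, sz);   /* try to lock the stack buffer in RAM */
   if (rc != 0) {
      printf("rc!=0\n");
   }
   return 0;
}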
The output is rc!=0, and strace says
"mlock(0x7fbffff0a0, 117044)             = -1 EPERM (Operation not permitted)"

Does this mean that the problem exists for any non-privileged user,
independent of the amount of memory requested to be locked?
And BTW, why does mlock() in strace want 117044 instead of 10 words?
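
To separate the two possible causes (a resource limit versus a lack of privilege), a small probe along these lines might help. It is a sketch only: the helper names are invented for illustration, and how the kernel treats unprivileged callers differs between versions. As far as the mlock(2) manual page describes it, Linux 2.6.8 and earlier require privilege (CAP_IPC_LOCK) to lock any memory at all, while 2.6.9 and later let an unprivileged process lock up to the RLIMIT_MEMLOCK soft limit.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Hypothetical helper: fetch the soft RLIMIT_MEMLOCK limit.
 * Returns 0 on success, -1 if getrlimit() fails. */
static int memlock_soft_limit(rlim_t *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    *out = rl.rlim_cur;
    return 0;
}

/* Hypothetical helper: try to lock `len` bytes of a static buffer.
 * Returns 0 on success, otherwise the errno value set by mlock(). */
static int try_mlock(size_t len)
{
    static char buf[4096];

    if (len > sizeof buf)
        return EINVAL;      /* refuse requests larger than the test buffer */
    if (mlock(buf, len) != 0)
        return errno;
    munlock(buf, len);      /* release the lock again */
    return 0;
}

#ifdef MLOCK_PROBE_MAIN   /* build with -DMLOCK_PROBE_MAIN for a standalone program */
int main(void)
{
    rlim_t lim;
    int err;

    if (memlock_soft_limit(&lim) != 0) {
        perror("getrlimit");
        return 1;
    }
    if (lim == RLIM_INFINITY)
        printf("RLIMIT_MEMLOCK soft limit: unlimited\n");
    else
        printf("RLIMIT_MEMLOCK soft limit: %llu bytes\n", (unsigned long long)lim);

    err = try_mlock(10);
    printf("mlock(buf, 10): %s\n", err ? strerror(err) : "ok");
    return 0;
}
#endif
```

Running it once as root and once as guest, with the same limits set, shows whether a 10-byte lock fails with EPERM even when the reported soft limit is far larger than the request.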

>I think there are two questions now:
>
>1) Can you run perf_main as user if you give mlock() permissions for 
>more than 2GB memory (provided the system has that much)?
>2) Why is perf_main requiring 2GB to be locked? My hunch is that there 
>is some build or other system issues which results in such behavior. 
>On our systems, definitely perf_main does not require that much 
>memory.
I'll write to Mellanox staff about perf_main.

Yours
Mikhail   


