[mvapich-discuss] Errors trying to build MVAPICH2-1.5 with the PANFS IO device

Jonathan Perkins perkinjo at cse.ohio-state.edu
Wed Aug 18 12:41:13 EDT 2010


The patch I applied went only to the 1.5 branch, not trunk, so I'm
not sure which tarball you grabbed.  Can you try this one?
http://mvapich.cse.ohio-state.edu/nightly/mvapich2/branches/1.5/mvapich2-1.5-2010-08-17.tar.gz
I've checked that it contains the patch I committed yesterday.
Please let us know if this resolves your problem.

On Wed, Aug 18, 2010 at 10:57 AM, David Gunter <dog at lanl.gov> wrote:
> I think you'll have to send me the patch. I grabbed last night's tarball (8/17) and it, too, has the exact same problem.
>
> -david
>
> --
> David Gunter
> HPC-3: Infrastructure Team
> Los Alamos National Laboratory
>
>
>
>
> On Aug 17, 2010, at 9:54 AM, Jonathan Perkins wrote:
>
>> This is true.  It looks like I've only applied the runtime fix
>> and not the build fix.  I've just committed this fix in our 1.5 branch
>> and it should be available in tonight's nightly tarball.  If you'd
>> like, you can checkout the source directly from
>> https://mvapich.cse.ohio-state.edu/svn/mpi/mvapich2/branches/1.5.  Or
>> I can send you the patch directly.  Let me know what works for you.
>>
>> On Tue, Aug 17, 2010 at 9:45 AM, David Gunter <dog at lanl.gov> wrote:
>>> Unfortunately this release has the exact same issue I just encountered. In fact, mvapich2-1.5-2010-08-15/src/mpi/romio/adio/ad_panfs/Makefile.in is identical to the 1.5 release.
>>>
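One way to repeat the comparison described above is a byte-for-byte check of the two Makefile.in files. This is a minimal sketch: the two directory names are assumptions about where the release and nightly tarballs were unpacked, and `same_file` is a hypothetical helper, not part of MVAPICH2.

```shell
#!/bin/sh
# Sketch: check whether a nightly tarball's ad_panfs Makefile.in differs
# from the one in the 1.5 release.
# RELEASE_DIR and NIGHTLY_DIR are assumptions; point them at your unpacked trees.
RELEASE_DIR="${RELEASE_DIR:-mvapich2-1.5}"
NIGHTLY_DIR="${NIGHTLY_DIR:-mvapich2-1.5-2010-08-15}"
MAKEFILE_IN="src/mpi/romio/adio/ad_panfs/Makefile.in"

# Hypothetical helper: returns 0 when the two files are byte-for-byte identical.
same_file() {
    cmp -s "$1" "$2"
}

if [ -f "$RELEASE_DIR/$MAKEFILE_IN" ] && [ -f "$NIGHTLY_DIR/$MAKEFILE_IN" ]; then
    if same_file "$RELEASE_DIR/$MAKEFILE_IN" "$NIGHTLY_DIR/$MAKEFILE_IN"; then
        echo "identical: the nightly does not change the ad_panfs Makefile.in"
    else
        echo "different: the nightly changes the ad_panfs Makefile.in"
    fi
else
    echo "unpack both trees first"
fi
```

An "identical" result only says the build file is unchanged; it does not show which branch the tarball was cut from.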
>>> -david
>>>
>>> --
>>> David Gunter
>>> HPC-3: Infrastructure Team
>>> Los Alamos National Laboratory
>>>
>>>
>>>
>>>
>>> On Aug 16, 2010, at 8:50 PM, Jonathan Perkins wrote:
>>>
>>>> Hi David, thanks for the post.  This issue came up earlier on our
>>>> discussion list:
>>>>
>>>> http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2010-July/002926.html
>>>>
>>>> We committed a fix for this to our 1.5 branch late last month.
>>>> Can you try our latest nightly tarball and see if this resolves the
>>>> problem for you?  Here is a link for your convenience...
>>>>
>>>> http://mvapich.cse.ohio-state.edu/nightly/mvapich2/branches/1.5/mvapich2-1.5-2010-08-15.tar.gz
>>>>
>>>> On Mon, Aug 16, 2010 at 7:20 PM, David Gunter <dog at lanl.gov> wrote:
>>>>> Here at LANL we use the Panasas parallel file system.  The recent MVAPICH2-1.5 refuses to build and I think I have figured out why. It has to do with some changes made that aren't present in the 1.4.1 release (the last version we have working here).
>>>>>
>>>>> mvapich2-1.5/src/mpi/romio/adio/ad_panfs/ad_panfs.h requires adio.h which requires adioi.h. The latter has been changed to now include a file mpiimpl.h which lives at
>>>>>
>>>>> mvapich2-1.5/src/include/mpiimpl.h
>>>>>
>>>>> Things would be fine except that mpiimpl.h includes files in directories that are not included in
>>>>>
>>>>> mvapich2-1.5/src/mpi/romio/adio/ad_panfs/Makefile.in. Namely, mpiimpl.h includes
>>>>>
>>>>> mvapich2-1.5/src/mpid/ch3/include/mpidpre.h
>>>>>
>>>>> and
>>>>>
>>>>> mvapich2-1.5/src/mpid/ch3/include/mpid_thread.h
>>>>>
>>>>> Thus a path to that include directory should be added to the above mentioned Makefile.in.  However, that doesn't fix the problem as mpidpre.h includes other files that are not in the list of include directories.
>>>>>
>>>>> mpidpre.h includes "mpid_dataloop.h" and "mpidi_ch3_pre.h". The first of these is
>>>>>
>>>>> mvapich2-1.5/src/mpid/common/datatype/mpid_dataloop.h
>>>>>
>>>>> but the second is found in multiple places depending on the channel one is building for, mrail in our case, so
>>>>>
>>>>> mvapich2-1.5/src/mpid/ch3/channels/mrail/include
>>>>>
>>>>> So that means two more sets of directories added to the Makefile.in in the ad_panfs directory.
>>>>>
>>>>> Finally, mpidi_ch3_pre.h includes one more file, "mpidi_ch3_rdma_pre.h", which is found in two different places depending on the device (gen2 or udapl). We build for gen2, so it is found in
>>>>>
>>>>> mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/mpidi_ch3_rdma_pre.h
>>>>>
>>>>> and that is one more include-directory to add to Makefile.in.
>>>>>
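The include chain walked through above can be condensed into a set of `-I` flags. This is a minimal sketch, not the committed fix: it uses only the four directories named in the message, for the mrail channel and the gen2 device. `SRCROOT` is an assumed location for the unpacked tree, and the name of the Makefile.in variable the flags belong in is not given in this thread.

```shell
#!/bin/sh
# Sketch: build the extra -I flags for the ad_panfs build from the
# directories named in the message above (mrail channel, gen2 device).
# SRCROOT is an assumption; point it at your unpacked mvapich2-1.5 tree.
SRCROOT="${SRCROOT:-mvapich2-1.5}"

extra_includes=""
for dir in \
    src/mpid/ch3/include \
    src/mpid/common/datatype \
    src/mpid/ch3/channels/mrail/include \
    src/mpid/ch3/channels/mrail/src/gen2
do
    # One -I flag per directory, space separated.
    extra_includes="$extra_includes -I$SRCROOT/$dir"
done

# Drop the leading space left by the first iteration.
extra_includes="${extra_includes# }"
echo "$extra_includes"
```

A build for another channel or for the udapl device would need different directories for the last two entries, as the message notes.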
>>>>> After doing all that, I finally get past the Romio build portion and the ad_panfs device is built with some warnings, but no errors.  The make process proceeds smoothly and dies again towards the end. I'm not sure if the final errors are related to the above issue.
>>>>>
>>>>> make[4]: Entering directory `/usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/pm/util'
>>>>> gcc -I/opt/ofed/include  -g -DNDEBUG -O2   -L/opt/ofed/lib64  -o mpiexec mpiexec.o  \
>>>>>        ../util/libmpiexec.a ../../../lib/libmpich.a -L/opt/ofed/lib64 -libverbs -libumad -lpthread
>>>>> ../../../lib/libmpich.a(ibv_param.o): In function `rdma_cm_get_hca_type':
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/ibv_param.c:370: undefined reference to `rdma_get_devices'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/ibv_param.c:416: undefined reference to `rdma_free_devices'
>>>>> ../../../lib/libmpich.a(rdma_cm.o): In function `ib_finalize_rdma_cm':
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:1229: undefined reference to `rdma_disconnect'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:1230: undefined reference to `rdma_destroy_qp'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:1274: undefined reference to `rdma_destroy_id'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:1283: undefined reference to `rdma_destroy_id'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:1285: undefined reference to `rdma_destroy_event_channel'
>>>>> ../../../lib/libmpich.a(rdma_cm.o): In function `rdma_cm_connect_to_server':
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:964: undefined reference to `rdma_create_id'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:976: undefined reference to `rdma_resolve_addr'
>>>>> ../../../lib/libmpich.a(rdma_cm.o): In function `rdma_cm_create_qp':
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:794: undefined reference to `rdma_create_qp'
>>>>> ../../../lib/libmpich.a(rdma_cm.o): In function `rdma_cm_get_contexts':
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:728: undefined reference to `rdma_create_id'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:737: undefined reference to `rdma_resolve_addr'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:748: undefined reference to `rdma_destroy_id'
>>>>> ../../../lib/libmpich.a(rdma_cm.o): In function `ib_init_rdma_cm':
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:599: undefined reference to `rdma_create_event_channel'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:653: undefined reference to `rdma_create_id'
>>>>> ../../../lib/libmpich.a(rdma_cm.o): In function `bind_listen_port':
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:547: undefined reference to `rdma_bind_addr'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:557: undefined reference to `rdma_bind_addr'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:566: undefined reference to `rdma_listen'
>>>>> ../../../lib/libmpich.a(rdma_cm.o): In function `ib_cma_event_handler':
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:124: undefined reference to `rdma_resolve_route'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:184: undefined reference to `rdma_connect'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:281: undefined reference to `rdma_accept'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:240: undefined reference to `rdma_reject'
>>>>> ../../../lib/libmpich.a(rdma_cm.o): In function `cm_thread':
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:420: undefined reference to `rdma_ack_cm_event'
>>>>> /usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/mpid/ch3/channels/mrail/src/gen2/rdma_cm.c:403: undefined reference to `rdma_get_cm_event'
>>>>> collect2: ld returned 1 exit status
>>>>> make[3]: *** [mpiexec] Error 1
>>>>> make[3]: Leaving directory `/usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/pm/gforker'
>>>>> make[2]: *** [all-redirect] Error 1
>>>>> make[2]: Leaving directory `/usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src/pm'
>>>>> make[1]: *** [all-redirect] Error 2
>>>>> make[1]: Leaving directory `/usr/projects/hpctools/dog/mvapich2/mvapich2-1.5/src'
>>>>> make: *** [all-redirect] Error 2
>>>>>
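Every undefined symbol in the log above (`rdma_create_id`, `rdma_resolve_addr`, `rdma_get_cm_event`, and so on) belongs to the RDMA connection manager API provided by librdmacm, while the gcc line links only `-libverbs -libumad -lpthread`. The sketch below just checks a link line for `-lrdmacm`; `has_rdmacm` is a hypothetical helper, and whether adding the library is the right fix for this build is not established in this thread.

```shell
#!/bin/sh
# Sketch: report whether a link command already names librdmacm.
# The link line is copied from the failing build output above.
link_line='gcc -I/opt/ofed/include  -g -DNDEBUG -O2   -L/opt/ofed/lib64  -o mpiexec mpiexec.o ../util/libmpiexec.a ../../../lib/libmpich.a -L/opt/ofed/lib64 -libverbs -libumad -lpthread'

# Hypothetical helper: returns 0 if -lrdmacm appears as its own word.
has_rdmacm() {
    case " $1 " in
        *" -lrdmacm "*) return 0 ;;
        *) return 1 ;;
    esac
}

if has_rdmacm "$link_line"; then
    status="present"
else
    status="missing"
fi
echo "librdmacm on link line: $status"
```

For the link line in the log the check reports the library as missing, which is consistent with the undefined `rdma_*` references.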
>>>>> -david
>>>>> --
>>>>> David Gunter
>>>>> HPC-3: Infrastructure Team
>>>>> Los Alamos National Laboratory
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> mvapich-discuss mailing list
>>>>> mvapich-discuss at cse.ohio-state.edu
>>>>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>>>>
>>>>>
>>>>
>>>> --
>>>> Jonathan Perkins
>>>
>>>
>>>
>>
>>
>>
>> --
>> Jonathan Perkins
>
>
>



-- 
Jonathan Perkins


