[Hadoop-RDMA] Troubles with deploying hadoop-rdma-0.9.8 over IB cluster
Alexander Frolov
alexndr.frolov at gmail.com
Wed Feb 5 10:22:00 EST 2014
Hello,
I am trying to deploy Hadoop-RDMA on an 8-node IB (OFED-1.5.3-4.0.42) cluster
and have run into the following problem (a.k.a. "File ... could only be
replicated to 0 nodes, instead of 1"):
frolo at A11:~/hadoop-rdma-0.9.8> ./bin/hadoop dfs -copyFromLocal ../pg132.txt /user/frolo/input/pg132.txt
Warning: $HADOOP_HOME is deprecated.
14/02/05 19:06:30 WARN hdfs.DFSClient: DataStreamer Exception: java.lang.reflect.UndeclaredThrowableException
at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(Unknown Source)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(Unknown Source)
at com.sun.proxy.$Proxy1.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.From.Code(Unknown Source)
at org.apache.hadoop.hdfs.From.F(Unknown Source)
at org.apache.hadoop.hdfs.From.F(Unknown Source)
at org.apache.hadoop.hdfs.The.run(Unknown Source)
Caused by: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/frolo/input/pg132.txt could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(Unknown Source)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.ipc.RPC$Server.call(Unknown Source)
at org.apache.hadoop.ipc.rdma.madness.Code(Unknown Source)
at org.apache.hadoop.ipc.rdma.madness.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(Unknown Source)
at org.apache.hadoop.ipc.rdma.be.run(Unknown Source)
at org.apache.hadoop.ipc.rdma.RDMAClient.Code(Unknown Source)
at org.apache.hadoop.ipc.rdma.RDMAClient.call(Unknown Source)
at org.apache.hadoop.ipc.Tempest.invoke(Unknown Source)
... 12 more
14/02/05 19:06:30 WARN hdfs.DFSClient: Error Recovery for null bad datanode[0] nodes == null
14/02/05 19:06:30 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/frolo/input/pg132.txt" - Aborting...
14/02/05 19:06:30 INFO hdfs.DFSClient: exception in isClosed
It seems that data is not transferred to the DataNodes when I start copying
from the local filesystem to HDFS. I tested the availability of the DataNodes:
frolo at A11:~/hadoop-rdma-0.9.8> ./bin/hadoop dfsadmin -report
Warning: $HADOOP_HOME is deprecated.
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: �%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
-------------------------------------------------
Datanodes available: 0 (4 total, 4 dead)
Name: 10.10.1.13:50010
Decommission Status : Normal
Configured Capacity: 0 (0 KB)
DFS Used: 0 (0 KB)
Non DFS Used: 0 (0 KB)
DFS Remaining: 0(0 KB)
DFS Used%: 100%
DFS Remaining%: 0%
Last contact: Wed Feb 05 19:02:54 MSK 2014
Name: 10.10.1.14:50010
Decommission Status : Normal
Configured Capacity: 0 (0 KB)
DFS Used: 0 (0 KB)
Non DFS Used: 0 (0 KB)
DFS Remaining: 0(0 KB)
DFS Used%: 100%
DFS Remaining%: 0%
Last contact: Wed Feb 05 19:02:54 MSK 2014
Name: 10.10.1.16:50010
Decommission Status : Normal
Configured Capacity: 0 (0 KB)
DFS Used: 0 (0 KB)
Non DFS Used: 0 (0 KB)
DFS Remaining: 0(0 KB)
DFS Used%: 100%
DFS Remaining%: 0%
Last contact: Wed Feb 05 19:02:54 MSK 2014
Name: 10.10.1.11:50010
Decommission Status : Normal
Configured Capacity: 0 (0 KB)
DFS Used: 0 (0 KB)
Non DFS Used: 0 (0 KB)
DFS Remaining: 0(0 KB)
DFS Used%: 100%
DFS Remaining%: 0%
Last contact: Wed Feb 05 19:02:55 MSK 2014
and tried to mkdir in the HDFS filesystem, which succeeded. Restarting
the Hadoop daemons has not had any positive effect.
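The dead-DataNode state in the report above can also be checked programmatically. What follows is only a minimal Python sketch, assuming the report text has the same layout as the output shown in this message; `parse_datanode_summary` and `zero_capacity_nodes` are hypothetical helper names and not part of Hadoop or Hadoop-RDMA.

```python
import re

# Matches the summary line, e.g. "Datanodes available: 0 (4 total, 4 dead)".
# The layout is assumed from the report quoted in this message.
SUMMARY_RE = re.compile(
    r"Datanodes available:\s*(\d+)\s*\((\d+)\s+total,\s*(\d+)\s+dead\)"
)


def parse_datanode_summary(report):
    """Return (available, total, dead) from the report text, or None if absent."""
    match = SUMMARY_RE.search(report)
    if match is None:
        return None
    available, total, dead = (int(g) for g in match.groups())
    return available, total, dead


def zero_capacity_nodes(report):
    """Return the "Name:" entries whose per-node Configured Capacity is 0."""
    nodes = []
    current = None  # name of the DataNode block currently being read
    for line in report.splitlines():
        line = line.strip()
        if line.startswith("Name:"):
            current = line.split(":", 1)[1].strip()
        elif line.startswith("Configured Capacity:") and current is not None:
            # First token after the colon is the capacity in bytes.
            value = line.split(":", 1)[1].split()[0]
            if int(value) == 0:
                nodes.append(current)
            current = None  # only the first capacity line belongs to this node
    return nodes
```

The text passed in would be the captured output of `./bin/hadoop dfsadmin -report`. The cluster-wide "Configured Capacity" line is skipped because it appears before any "Name:" entry.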
Could you please help me with this issue? Thank you.
Best,
Alex