Unable to Create a Temp File on the HDFS Data Node

On a newly created Hadoop edge node, I tried inserting data through the Hive CLI. Instead of the expected success message, the statement failed with the exception shown below.
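For context, the statement was a plain INSERT ... VALUES; Hive stages the VALUES clause as a temporary table on HDFS, which is why the path in the error contains Values__Tmp__Table__1. A minimal, hypothetical reproduction looks something like this (table name and values are made up):

CREATE TABLE test_insert (id INT, name STRING);
INSERT INTO TABLE test_insert VALUES (1, 'kylin');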

FAILED: SemanticException [Error 10293]: Unable to create temp file for insert values File /tmp/hive/kylin/9c84de0a-fca2-4d3c-8f72-47436a4adb83/_tmp_space.db/Values__Tmp__Table__1/data_file could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1720)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3440)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:686)
	at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.addBlock(AuthorizationProviderProxyClientProtocol.java:217)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1073)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2226)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2222)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2220)

ERROR: Current user has no permission to create Hive table in working directory: /user/kylin

From the error message, my first guess was a permission problem on the /user/kylin directory (which was odd, since the operation was running as the kylin user in the first place). Simply loosening the directory's permissions all the way to 777 did not make the error go away. I then switched to Hive's Beeline connection, repeated the same insert statement, and it succeeded! So what was really causing the error above?
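For reference, the permission change and the Beeline retry were along these lines (the HiveServer2 host and port are placeholders):

hdfs dfs -chmod -R 777 /user/kylin                        # loosen the working directory - did not help
beeline -u jdbc:hive2://hiveserver2-host:10000 -n kylin   # rerun the same INSERT through Beeline - succeeded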

A round of Google searching turned up all kinds of explanations: not enough HDFS storage space, firewalls left running on the cluster nodes, a misbehaving DataNode service, and so on. I tried every suggestion I found online, but the problem remained. The firewall theory did, however, get me wondering whether the issue might be IP-related.
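The usual checks for those theories are quick to run; a rough sketch (run on the cluster nodes, service names depend on the OS):

hdfs dfsadmin -report            # overall capacity, remaining space, and the list of live DataNodes
jps                              # on the data node: confirm the DataNode process is up
systemctl status firewalld       # or: service iptables status, on older releases

Since none of these turned up anything actionable here, the investigation moved on to the network configuration.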

The cluster is built on local virtual machines that happen to be configured with two network interfaces, and the edge node connects to the cluster through its static IP address. The interfaces look like this:

eth0      Link encap:Ethernet  HWaddr 08:00:27:B2:38:58  
          inet addr:10.0.2.15  Bcast:10.0.2.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:797 errors:0 dropped:0 overruns:0 frame:0
          TX packets:944 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:98791 (96.4 KiB)  TX bytes:84770 (82.7 KiB)

eth1      Link encap:Ethernet  HWaddr 08:00:27:B5:9D:6A  
          inet addr:192.168.56.104  Bcast:192.168.56.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:3523935 errors:0 dropped:0 overruns:0 frame:0
          TX packets:443589 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:5073146719 (4.7 GiB)  TX bytes:163351146 (155.7 MiB)

lo        Link encap:Local Loopback  
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:342031 errors:0 dropped:0 overruns:0 frame:0
          TX packets:342031 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0 
          RX bytes:405110832 (386.3 MiB)  TX bytes:405110832 (386.3 MiB)

I then checked the /etc/hosts configuration and, sure enough, the default 10.0.2.15 address was the one mapped to the cluster hostname. After changing it to the static IP address and restarting all of the Hadoop cluster services, I connected through the Hive CLI again and the earlier insert statement ran without a hitch.
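The fix boiled down to a one-line change in /etc/hosts (the hostname is a placeholder; the addresses come from the ifconfig output above - 10.0.2.15 is the default VirtualBox NAT address on eth0, which other machines cannot reach, while 192.168.56.104 is the host-only static address on eth1):

# before: the NAT address was mapped to the cluster hostname
10.0.2.15        hadoop-node1

# after: map the hostname to the static host-only address instead
192.168.56.104   hadoop-node1

With the NAT address in place, the NameNode presumably handed the client a DataNode address it could not reach, so the node ended up in the excluded list and the block "could only be replicated to 0 nodes" even though one DataNode was alive.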

Summary

When configuring a Hadoop cluster, pay particular attention to how IP addresses are assigned; referring to nodes by hostname in the configuration is the safer way to avoid IP-related problems. And when existing write-ups don't solve your problem, go back and carefully review your own environment's configuration.
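A minimal sketch of what "by hostname" means in practice (the hostname and port are placeholders; the point is that core-site.xml and the other *-site.xml files reference the hostname, which /etc/hosts or DNS then maps to the right address):

<!-- core-site.xml -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop-node1:8020</value>  <!-- hostname instead of a raw IP -->
</property>

With that setup, fixing an address mix-up like this one only requires correcting the hostname mapping, not editing the Hadoop configuration itself.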