Apache Spark - Why does one RDD count job take so much time?
I used the newAPIHadoopRDD() method to load HBase records into an RDD, and then ran a simple count job.
However, the count job takes far more time than I would have imagined. I checked the code, and I am thinking that one column family in HBase may hold a lot of data, so that when I load the records into the RDD, this data may cause the executors' memory to overflow.
Is this a possible cause of the issue?