mapreduce - hadoop - How would input splits form if a file has only one record and the file size is larger than the block size?
An example to explain the question:

I have a file of size 500 MB (input.csv), and the file contains only one line (record). How is the file stored in HDFS blocks, and how are the input splits computed?
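For concreteness, here is a minimal standalone sketch of the block/split arithmetic for such a file, assuming the default 128 MB HDFS block size; the class name SplitMath is purely illustrative (in a real job the splits come from FileInputFormat.getSplits):

// Illustrative only: how a 500 MB file is cut into blocks / default
// FileInputFormat splits when the block size is 128 MB.
public class SplitMath {
    public static void main(String[] args) {
        long fileSize  = 500L * 1024 * 1024;   // input.csv, 500 MB
        long blockSize = 128L * 1024 * 1024;   // assumed HDFS default block size

        long offset = 0;
        int index = 0;
        while (offset < fileSize) {
            long length = Math.min(blockSize, fileSize - offset);
            System.out.printf("split %d: offset=%d length=%d%n", index++, offset, length);
            offset += length;
        }
        // Prints 4 splits: three of 128 MB and a final one of 116 MB.
    }
}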
You should check this link: How does Hadoop process records split across block boundaries? Pay attention to the 'remote read' mentioned there.

The single record mentioned in the question is stored across several blocks. If you read it with TextInputFormat, the mapper has to perform remote reads across those blocks to process the record.
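Roughly, the rule that makes this work is: the reader for every split except the first skips everything up to the first newline after its start, and every reader keeps reading past its split's end until it finishes the current line. The sketch below imitates that rule with plain Java I/O; it is not the actual Hadoop LineRecordReader code, and the class name and argument layout (file path, split start, split end) are made up for illustration:

import java.io.IOException;
import java.io.RandomAccessFile;

// Sketch of the "finish the current line" rule; not Hadoop source code.
public class LineOverSplits {
    public static void main(String[] args) throws IOException {
        long splitStart = Long.parseLong(args[1]);
        long splitEnd   = Long.parseLong(args[2]);

        try (RandomAccessFile in = new RandomAccessFile(args[0], "r")) {
            long pos = splitStart;
            if (splitStart != 0) {
                // Not the first split: skip the partial line that the previous
                // split's reader is responsible for finishing.
                in.seek(splitStart);
                while (pos < in.length() && in.read() != '\n') pos++;
                pos++;
            }
            // Emit every line that *starts* before splitEnd, even if its body
            // runs past splitEnd -- in HDFS those extra bytes live in other
            // blocks, i.e. the remote reads mentioned above.
            while (pos < splitEnd && pos < in.length()) {
                in.seek(pos);
                long lineLen = 0;
                int b;
                while ((b = in.read()) != -1 && b != '\n') lineLen++;
                System.out.println("record at offset " + pos + ", length " + lineLen);
                pos += lineLen + 1;   // move past the newline (or EOF)
            }
        }
    }
}

For the 500 MB single-line file, only the reader for the first split emits a record; the readers for the remaining splits skip to end of file and emit nothing, and the bytes the first reader pulls from the other three blocks are the remote reads.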