hadoop - How to group bag in Pig
First I load the data and group it by time:
a = LOAD './test.txt' USING PigStorage(' ') AS (id:int, time:int, value:float);
b = GROUP a BY time;
For example, the result has this structure:
1001 {(1,1001,0.2),(3,1001,0.3),(2,1001,0.3),(4,1001,0.6)}
1002 {(2,1002,0.5),(1,1002,0.3),(3,1002,0.1),(4,1002,0.6)}
1003 {(4,1003,0.2),(1,1003,0.8),(2,1003,0.4),(3,1003,0.5)}
But I want the tuples inside each bag sorted by id:
1001 {(1,1001,0.2),(2,1001,0.3),(3,1001,0.3),(4,1001,0.6)}
1002 {(1,1002,0.3),(2,1002,0.5),(3,1002,0.1),(4,1002,0.6)}
1003 {(1,1003,0.8),(2,1003,0.4),(3,1003,0.5),(4,1003,0.2)}
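Use a nested FOREACH and order the bag inside it. The bag produced by the group is named after the grouped relation (a), so that is what ORDER has to sort:
c = FOREACH b {
    sort_by_id = ORDER a BY id;
    GENERATE group, sort_by_id;
};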
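To check the result, the whole pipeline can be run end to end and inspected with DESCRIBE and DUMP. This is only a sketch: it assumes ./test.txt is space-delimited with the three columns from the question, and it reuses the aliases a, b and c from above.

```pig
-- Load the space-delimited file with an explicit schema
a = LOAD './test.txt' USING PigStorage(' ') AS (id:int, time:int, value:float);

-- One row per time value; each row carries a bag named "a"
b = GROUP a BY time;

-- Sort the tuples of each bag by id inside the nested block
c = FOREACH b {
    sort_by_id = ORDER a BY id;
    GENERATE group, sort_by_id;
};

-- Print the schema of c, then its contents
DESCRIBE c;
DUMP c;
```

If the nested ORDER is working, each bag printed by DUMP should list its tuples by ascending id, as in the desired output in the question. Writing ORDER a BY id DESC would reverse that order.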