列转行用到的是concat_ws和collect_list(collect_set),行转列用到的是lateral view explode和split。
1.列转行
测试数据
1 | hive> select * from col_row limit 10; |
把相同user_id的order_id按照逗号转为一行
1 | select |
结果(简写):
1 | user_id order_value |
总结
使用函数:concat_ws(‘,’, collect_set(column))
说明:collect_list 不去重,collect_set 去重。 column的数据类型要求是string
2.行转列
测试数据
1 | hive> select * from row_col; |
将order_value的每条记录切割为单元素
1 | select |
解析:
- split会将order_value字段按照指定的字符进行分割成集合;
- explode将集合进行展开;
- lateral view会根据explode结果生成一个侧视图,与row_col进行笛卡尔积操作;
结果如下:1
2
3
4
5
6
7
8
9
10
11
12user_id order_value order_id
104408 2909888,2662805,2922438,674972,2877863,190237 2909888
104408 2909888,2662805,2922438,674972,2877863,190237 2662805
104408 2909888,2662805,2922438,674972,2877863,190237 2922438
104408 2909888,2662805,2922438,674972,2877863,190237 674972
104408 2909888,2662805,2922438,674972,2877863,190237 2877863
104408 2909888,2662805,2922438,674972,2877863,190237 190237
104407 2982655,814964,1484250,2323912,2689723,2034331,1692373,677498,156562,2862492,338128 2982655
104407 2982655,814964,1484250,2323912,2689723,2034331,1692373,677498,156562,2862492,338128 814964
104407 2982655,814964,1484250,2323912,2689723,2034331,1692373,677498,156562,2862492,338128 1484250
104407 2982655,814964,1484250,2323912,2689723,2034331,1692373,677498,156562,2862492,338128 2323912
Time taken: 0.096 seconds, Fetched: 10 row(s)
You have to believe yourself, that’s the secret of the success.
书山有路勤为径,学海无涯苦作舟。
欢迎关注微信公众号:【程序员写书】
喜欢宠物的朋友可以关注:【电巴克宠物Pets】
一起学习,一起进步。
