Uploading data to Hive through Flume

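Published: 2023-09-06 01:54 | Editor: Mr. Gu | Keywords: none

Goal: accept HTTP requests on port 1084 and store the request data in a Hive database.
osgiweb2.db is the warehouse directory of the database created in Hive (osgiweb2), and
periodic_report5 is the table created in that database.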

 1.  Flume configuration:

a1.sources=r1
a1.channels=c1
a1.sinks=k1

a1.sources.r1.type = http
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 1084
a1.sources.r1.handler=jkong.Test.HTTPSourceDPIHandler

#a1.sources.r1.interceptors=i1 i2
#a1.sources.r1.interceptors.i1.type=regex_filter
#a1.sources.r1.interceptors.i1.regex=\\{.*\\}
#a1.sources.r1.interceptors.i2.type=timestamp

a1.channels.c1.type=memory
a1.channels.c1.capacity=10000
a1.channels.c1.transactionCapacity=1000
a1.channels.c1.keep-alive=30

a1.sinks.k1.type=hdfs
a1.sinks.k1.channel=c1
a1.sinks.k1.hdfs.path=hdfs://gw-sp.novalocal:1086/user/hive/warehouse/osgiweb2.db/periodic_report5
a1.sinks.k1.hdfs.fileType=DataStream
a1.sinks.k1.hdfs.writeFormat=Text
a1.sinks.k1.hdfs.rollInterval=0
a1.sinks.k1.hdfs.rollSize=10240
a1.sinks.k1.hdfs.rollCount=0
a1.sinks.k1.hdfs.idleTimeout=60

a1.sources.r1.channels=c1
a1.sinks.k1.channel=c1
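The roll settings in the sink configuration decide when Flume closes the file it is writing in HDFS. In the Flume HDFS sink, a value of 0 for rollInterval, rollSize, or rollCount disables that trigger, so with the values above a file is closed once it reaches 10240 bytes or after 60 seconds without new events. The following is only a simplified model of that decision, not Flume's actual code; the class and method names (RollPolicy, should_close) are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class RollPolicy:
    """Simplified model of the HDFS sink roll settings (0 disables a trigger)."""
    roll_interval_s: int = 0       # hdfs.rollInterval
    roll_size_bytes: int = 10240   # hdfs.rollSize
    roll_count: int = 0            # hdfs.rollCount
    idle_timeout_s: int = 60       # hdfs.idleTimeout

    def should_close(self, age_s: float, bytes_written: int,
                     events_written: int, idle_s: float) -> bool:
        """Return True if any enabled trigger says the current file should be closed."""
        if self.roll_interval_s > 0 and age_s >= self.roll_interval_s:
            return True
        if self.roll_size_bytes > 0 and bytes_written >= self.roll_size_bytes:
            return True
        if self.roll_count > 0 and events_written >= self.roll_count:
            return True
        if self.idle_timeout_s > 0 and idle_s >= self.idle_timeout_s:
            return True
        return False


policy = RollPolicy()
# Old file with many events, but small and recently active: stays open.
print(policy.should_close(age_s=3600, bytes_written=500, events_written=9999, idle_s=5))
# Size threshold reached: closed.
print(policy.should_close(age_s=10, bytes_written=10240, events_written=3, idle_s=0))
# Idle for 60 seconds: closed.
print(policy.should_close(age_s=10, bytes_written=500, events_written=3, idle_s=60))
```

A small rollSize such as 10240 bytes tends to produce many small files in the table directory, which may be acceptable for testing but is worth revisiting for larger volumes.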

 2.  Create the table:

create table periodic_report5(
  id BIGINT,
  deviceId STRING,
  report_time STRING,
  information STRING
)
row format serde "org.openx.data.jsonserde.JsonSerDe"
WITH SERDEPROPERTIES(
  "id"="$.id",
  "deviceId"="$.deviceId",
  "report_time"="$.report_time",
  "information"="$.information"
);
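Because the table reads text files through a JSON SerDe, each line written under the table directory should be one complete JSON object whose keys match the column names. The sketch below shows one way to produce such a line; the helper name to_json_line and the sample values are hypothetical, and the format that the custom HTTP handler actually writes is not shown in this post.

```python
import json

# Column names taken from the periodic_report5 statement above.
COLUMNS = ("id", "deviceId", "report_time", "information")


def to_json_line(record: dict) -> str:
    """Serialize one record as a single-line JSON object with exactly the table's columns."""
    missing = [c for c in COLUMNS if c not in record]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    row = {c: record[c] for c in COLUMNS}
    # Without an indent argument json.dumps emits no raw newlines,
    # so each record stays on a single line.
    return json.dumps(row, ensure_ascii=False)


# Hypothetical sample record; the values are invented for illustration.
sample = {
    "id": 1,
    "deviceId": "dev-001",
    "report_time": "2018-05-18 10:00:00",
    "information": "line one\nline two",
    "extra": "dropped because it is not a table column",
}
print(to_json_line(sample))
```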

  2.1  A CREATE TABLE statement that also splits the nested field into its own sub-fields (not yet tested; not used for now)

create table periodic_report4(
  id BIGINT,
  deviceId STRING,
  report_time STRING,
  information STRUCT<
    actualTime:BIGINT,
    dpiVersionInfo:STRING,
    subDeviceInfo:STRING,
    wanTrafficData:STRING,
    ponInfo:STRING,
    eventType:STRING,
    potsInfo:STRING,
    deviceInfo:STRING,
    deviceStatus:STRING
  >
)
row format serde "org.openx.data.jsonserde.JsonSerDe"
WITH SERDEPROPERTIES(
  "input.invalid.ignore"="true",
  "id"="$.id",
  "deviceId"="$.deviceId",
  "report_time"="$.report_time",
  "requestParams.actualTime"="$.requestParams.actualTime",
  "requestParams.dpiVersionInfo"="$.requestParams.dpiVersionInfo",
  "requestParams.subDeviceInfo"="$.requestParams.subDeviceInfo",
  "requestParams.wanTrafficData"="$.requestParams.wanTrafficData",
  "requestParams.ponInfo"="$.requestParams.ponInfo",
  "requestParams.eventType"="$.requestParams.eventType",
  "requestParams.potsInfo"="$.requestParams.potsInfo",
  "requestParams.deviceInfo"="$.requestParams.deviceInfo",
  "requestParams.deviceStatus"="$.requestParams.deviceStatus"
);
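A STRUCT column corresponds to a nested JSON object inside each record. The sketch below builds such a record so the shape is easier to see. Note that the statement above names the column information while its SERDEPROPERTIES refer to requestParams, and the post does not say which key the incoming JSON really uses; the key parameter here is therefore an assumption, as are the function name and the sample values.

```python
import json

# Sub-field names copied from the STRUCT definition above.
STRUCT_FIELDS = (
    "actualTime", "dpiVersionInfo", "subDeviceInfo", "wanTrafficData",
    "ponInfo", "eventType", "potsInfo", "deviceInfo", "deviceStatus",
)


def nested_json_line(rec_id, device_id, report_time, info, key="information"):
    """Build a one-line JSON record whose `key` holds a nested object.

    Sub-fields absent from `info` are emitted as null.
    """
    nested = {f: info.get(f) for f in STRUCT_FIELDS}
    return json.dumps({
        "id": rec_id,
        "deviceId": device_id,
        "report_time": report_time,
        key: nested,
    })


# Hypothetical values, invented for illustration.
line = nested_json_line(
    1, "dev-001", "2018-05-18 10:00:00",
    {"actualTime": 1526608800, "eventType": "periodic"},
)
print(line)
```

If the data really arrives under requestParams, passing key="requestParams" shows the alternative shape; either way the table definition and the JSON keys would need to agree before this variant is used.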

3. Start Flume (run from the Flume root directory):

bin/flume-ng agent --conf ./conf/ -f ./conf/flume.conf --name a1 -Dflume.root.logger=DEBUG,console
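Once the agent is running, a test event can be sent to the HTTP source on port 1084. The following is a minimal sketch using only the Python standard library. It assumes the agent is reachable on localhost and that the custom handler jkong.Test.HTTPSourceDPIHandler accepts a plain JSON object as the request body; the handler's real input format is not shown in this post, so the payload may need adjusting. (Flume's default JSONHandler, by contrast, expects a JSON array of events with headers and body fields.) The function names and sample values are hypothetical.

```python
import json
import urllib.request


def build_request(event: dict, host: str = "localhost",
                  port: int = 1084) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the event as JSON."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url=f"http://{host}:{port}/",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_event(event: dict, host: str = "localhost", port: int = 1084) -> int:
    """Send the event to the Flume HTTP source and return the HTTP status code."""
    with urllib.request.urlopen(build_request(event, host, port), timeout=5) as resp:
        return resp.status


# Hypothetical sample event; the values are invented for illustration.
event = {
    "id": 1,
    "deviceId": "dev-001",
    "report_time": "2018-05-18 10:00:00",
    "information": "{}",
}
req = build_request(event)
print(req.get_method(), req.full_url)
# To actually send it while the agent is running: post_event(event)
```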

4. Start Hive (run from the Hive bin directory):

hive

or:

./hive -hiveconf hive.root.logger=DEBUG,console   # start with log output

Original article: https://www.cnblogs.com/redhat0019/p/9055335.html
