06_Flume_interceptor_Timestamp+Host

Published: 2023-09-06 01:28 | Editor: 傅花花 | Keywords: timestamp

1. Target scenario

2. Flume agent configuration file

# 01 define agent name, source/sink/channel name
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# 02 source, http, jsonhandler
a1.sources.r1.type = http
a1.sources.r1.bind = master
a1.sources.r1.port = 6666
a1.sources.r1.handler = org.apache.flume.source.http.JSONHandler

# 03 timestamp and host interceptors, applied to events from the source before they reach the channel
# two interceptors are chained and act on each event in order
a1.sources.r1.interceptors = i1 i2
a1.sources.r1.interceptors.i1.type = timestamp
a1.sources.r1.interceptors.i1.preserveExisting = false
# the event header gains "hostname": <actual hostname>
a1.sources.r1.interceptors.i2.type = host
# header key; the value is filled with the hostname of the node running the Flume agent
a1.sources.r1.interceptors.i2.hostHeader = hostname
# choose either IP or hostname; false selects the hostname
a1.sources.r1.interceptors.i2.useIP = false

# 04 hdfs sink
a1.sinks.k1.type = hdfs
# the hdfs sink replaces %Y-%m-%d using the timestamp in the event header
a1.sinks.k1.hdfs.path = hdfs://master:9000/flume/%Y-%m-%d/
# the key must match hostHeader: the hdfs sink reads the header value stored
# under "hostname" and uses it as the file-name prefix
a1.sinks.k1.hdfs.filePrefix = %{hostname}
a1.sinks.k1.hdfs.fileType = DataStream
a1.sinks.k1.hdfs.writeFormat = Text
a1.sinks.k1.hdfs.rollInterval = 0
a1.sinks.k1.hdfs.rollCount = 10
a1.sinks.k1.hdfs.rollSize = 1024000

# channel, memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# bind source, sink to channel
a1.sinks.k1.channel = c1
a1.sources.r1.channels = c1
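To make the header flow in this configuration concrete, here is a minimal Python sketch. It is not Flume's code; it only imitates what the two interceptors add to an event header and how the `%Y-%m-%d` and `%{hostname}` escapes could then be resolved. The function names (`timestamp_interceptor`, `host_interceptor`, `resolve_escapes`) and the `tz` parameter are hypothetical, and only the escapes used in the config are handled.

```python
import re
import socket
import time
from datetime import datetime, timezone


def timestamp_interceptor(headers, preserve_existing=False, now_ms=None):
    """Imitates type = timestamp: writes epoch milliseconds under 'timestamp'."""
    if preserve_existing and "timestamp" in headers:
        return headers  # keep the value that is already there
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    headers["timestamp"] = str(now_ms)
    return headers


def host_interceptor(headers, host_header="host", hostname=None):
    """Imitates type = host with useIP = false: writes the hostname under host_header."""
    headers[host_header] = hostname if hostname is not None else socket.gethostname()
    return headers


def resolve_escapes(template, headers, tz=None):
    """Expands %{key} from headers and %Y, %m, %d from the 'timestamp' header."""
    def replace(match):
        if match.group(1) is not None:
            # %{key}: look the key up in the event header
            return headers[match.group(1)]
        # %Y, %m or %d: format the header timestamp (tz=None means local time)
        seconds = int(headers["timestamp"]) / 1000
        return datetime.fromtimestamp(seconds, tz=tz).strftime("%" + match.group(2))
    return re.sub(r"%\{([^}]+)\}|%([Ymd])", replace, template)


if __name__ == "__main__":
    headers = {}  # an event that arrives with empty headers
    timestamp_interceptor(headers, now_ms=1512388800000)  # i1
    host_interceptor(headers, host_header="hostname", hostname="master")  # i2
    print(headers)
    print(resolve_escapes("hdfs://master:9000/flume/%Y-%m-%d/", headers, tz=timezone.utc))
    print(resolve_escapes("%{hostname}", headers))
```

The sketch shows why the key in `hdfs.filePrefix` has to match `hostHeader`: if the host interceptor wrote its value under a different key, the `%{hostname}` lookup would find nothing to substitute.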

3. Verifying the timestamp + host interceptors

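Verification approach:

1) First print the events, after the interceptors have processed them, to the console through a logger sink, and verify that the headers were added correctly.
2) Change the sink to hdfs and check whether the directory and file names are created as expected (timestamp for the directory, hostname for the file prefix).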

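Verification process:

1) Send an HTTP request whose event headers are empty, let the logger sink print the event to the terminal, and check whether timestamp and hostname have been added to the event header.

2) The event that the logger sink prints to the console shows that the header has changed.

3) Change the sink to hdfs and check the HDFS directory name (timestamp) and the file prefix (hostname).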

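* The directory name is substituted correctly (based on the timestamp in the event header).

* The file prefix is substituted correctly (based on hostname in the event header: the actual hostname).

* The file content written is the body of the event.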
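The empty-header request from step 1 can be reproduced with a small script. This is a sketch under assumptions: it uses only the Python standard library, the helper names `build_payload` and `send_events` are hypothetical, and the URL is taken from the `bind` and `port` values in the configuration of section 2. JSONHandler accepts a JSON array of events, each with a `headers` map and a string `body`.

```python
import json
import urllib.request


def build_payload(bodies):
    """Builds the JSON array JSONHandler expects, with empty headers on every event."""
    events = [{"headers": {}, "body": body} for body in bodies]
    return json.dumps(events).encode("utf-8")


def send_events(url, bodies, timeout=5):
    """POSTs the events to the Flume HTTP source and returns the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=build_payload(bodies),
        headers={"Content-Type": "application/json; charset=UTF-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

With the agent running, a call such as `send_events("http://master:6666", ["hello flume"])` sends one event. Because the headers are empty on the way in, any timestamp or hostname entry seen afterwards must have been added by the interceptors.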


Original article: http://www.cnblogs.com/shay-zhangjin/p/7965828.html
