
About InputFlowFile

Interface org.apache.nifi.python.processor.InputFlowFile; Part of the Java Framework

A FlowFile passed to the transform method is a proxy object that implements the InputFlowFile interface. This object has no direct counterpart in the Python NiFi Framework.

Attributes

| Name | Type | Description |
| --- | --- | --- |
| getContentsAsBytes() | Callable | Returns bytes; the contents of the FlowFile as a byte array |
| getContentsAsReader() | Callable | Returns BufferedReader; a buffered reader that wraps the FlowFile's content, allowing access to a single line at a time |
| getSize() | Callable | Returns int; the FlowFile size in bytes |
| getAttribute(String) | Callable | Returns Mixed; the value of the attribute identified by its name, or None if the attribute does not exist |
| getAttributes() | Callable | Returns dict; a dictionary containing all FlowFile attributes and their values |

e.g.:

from nifiapi.flowfiletransform import (
    FlowFileTransform,
    FlowFileTransformResult,
)
from nifiapi.properties import ProcessContext
from json import loads


class Processor(FlowFileTransform):
    (...)

    def transform(
        self, context: ProcessContext, flow_file
    ) -> FlowFileTransformResult:
        '''
        Parameters:
            context (ProcessContext)
            flow_file

        Returns:
            FlowFileTransformResult
        '''
        # Decode JSON encoded FlowFile contents
        json_contents = loads(flow_file.getContentsAsBytes())

        return FlowFileTransformResult('success')

note

In the above example, the parameter flow_file is an instance of InputFlowFile, automatically provisioned by the proxy.
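
Because getContentsAsBytes() loads the whole FlowFile into memory, it can be worth checking getSize() first. The following is a minimal sketch of that idea, not part of the NiFi API: the helper name load_json_if_small and the MAX_BYTES limit are hypothetical and chosen only for illustration.

```python
from json import loads

# Hypothetical limit; pick a value that suits the memory available to the processor.
MAX_BYTES = 10 * 1024 * 1024


def load_json_if_small(flow_file, max_bytes=MAX_BYTES):
    """Return the decoded JSON content, or None if the FlowFile is too large."""
    # getSize() reports the FlowFile size in bytes without reading the content.
    if flow_file.getSize() > max_bytes:
        return None
    # getContentsAsBytes() returns the full content as bytes, which json.loads accepts.
    return loads(flow_file.getContentsAsBytes())
```

Inside transform, a None result could then be routed to the failure relationship instead of attempting to decode an oversized FlowFile.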

getContentsAsReader

This method returns a BufferedReader that wraps the FlowFile's content. This allows a single line of text to be read at a time, rather than buffering the entirety of the FlowFile's content into memory.

caution

This method requires the proxy to serialise and deserialise each line, causing substantial delays.
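
Line-by-line reading as described above might look like the sketch below. It assumes the returned object exposes the readLine() method of java.io.BufferedReader through the proxy, and that the end of the content arrives in Python as None; the helper name iter_lines is hypothetical.

```python
def iter_lines(flow_file):
    """Yield the FlowFile's content one line at a time."""
    reader = flow_file.getContentsAsReader()
    # Assumption: readLine() returns the next line without its line terminator,
    # and None once the end of the content has been reached.
    line = reader.readLine()
    while line is not None:
        yield line
        line = reader.readLine()
```

For example, `sum(1 for line in iter_lines(flow_file) if line.strip())` counts the non-empty lines without holding the whole content in memory, at the cost of one proxied call per line.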

getAttribute

FlowFile attributes in Apache NiFi are key-value pairs associated with a FlowFile that provide metadata about the data it contains. These attributes can be used to make routing and processing decisions within a data flow.

Attributes that always exist:

  • uuid
  • filename
  • path

Example of accessing FlowFile attributes:

# Get an attribute by name
filename = flow_file.getAttribute('filename')

# Get a dictionary of all FlowFile attributes
# Note: the getAttributes method returns an instance of the
# py4j.java_collections.JavaMap class, so convert it to a dict.
attributes = dict(flow_file.getAttributes())
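
Since getAttribute returns None for a missing attribute, a routing decision usually needs a fallback. The sketch below illustrates one way to do this; the helper names and the choice of the mime.type attribute are assumptions made for the example and are not prescribed by the interface.

```python
def attribute_or_default(flow_file, name, default):
    """Return the named attribute, or default when the attribute does not exist."""
    value = flow_file.getAttribute(name)
    return default if value is None else value


def choose_relationship(flow_file):
    """Pick a relationship name based on a (hypothetical) mime.type attribute."""
    mime_type = attribute_or_default(
        flow_file, "mime.type", "application/octet-stream"
    )
    # Only JSON content is treated as processable in this illustration.
    return "success" if mime_type == "application/json" else "failure"
```

The returned name could then be passed to FlowFileTransformResult inside transform.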