About InputFlowFile
Interface org.apache.nifi.python.processor.InputFlowFile; part of the Java framework.
A FlowFile passed to the transform method is an instance of a proxy object implementing the InputFlowFile interface.
This object has no direct counterpart in the Python NiFi Framework.
Attributes
Name | Type | Description |
---|---|---|
getContentsAsBytes() | Callable | Returns bytes: the contents of the FlowFile as a byte array |
getContentsAsReader() | Callable | Returns BufferedReader: a reader that wraps the FlowFile's content and allows it to be read one line at a time |
getSize() | Callable | Returns int: the size of the FlowFile in bytes |
getAttribute(String) | Callable | Returns str or None: the value of the attribute identified by its name; None if the attribute does not exist |
getAttributes() | Callable | Returns dict: a dictionary-like map containing all FlowFile attributes and their values |
e.g.:
from nifiapi.flowfiletransform import (
    FlowFileTransform,
    FlowFileTransformResult,
)
from nifiapi.properties import ProcessContext
from json import loads


class Processor(FlowFileTransform):
    (...)

    def transform(
        self, context: ProcessContext, flow_file
    ) -> FlowFileTransformResult:
        '''
        Parameters:
            context (ProcessContext)
            flow_file
        Returns:
            FlowFileTransformResult
        '''
        # Decode JSON encoded FlowFile contents
        json_contents = loads(flow_file.getContentsAsBytes())
        return FlowFileTransformResult('success')
In the above example, the parameter flow_file is an instance of InputFlowFile, automatically provisioned by the proxy.
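Because the transform method only relies on the methods listed in the table, content handling can be sketched and exercised outside NiFi. The following is a minimal sketch, not part of the NiFi API: StubFlowFile and parse_json_contents are hypothetical names introduced here for illustration, and the stub only mimics getContentsAsBytes and getSize.

```python
import json


class StubFlowFile:
    """Hypothetical stand-in for InputFlowFile, for local testing only."""

    def __init__(self, contents: bytes):
        self._contents = contents

    def getContentsAsBytes(self) -> bytes:
        return self._contents

    def getSize(self) -> int:
        return len(self._contents)


def parse_json_contents(flow_file):
    """Return the decoded JSON document, or None if the content is not valid JSON.

    Note: a body consisting of the JSON literal null also yields None.
    """
    try:
        # json.loads accepts bytes directly (UTF-8, UTF-16 or UTF-32 encoded).
        return json.loads(flow_file.getContentsAsBytes())
    except ValueError:
        # Covers both json.JSONDecodeError and UnicodeDecodeError.
        return None


parsed = parse_json_contents(StubFlowFile(b'{"id": 7}'))
invalid = parse_json_contents(StubFlowFile(b'not json'))
```

Inside transform, a None result could be used to return a relationship other than 'success', provided the processor defines one.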
getContentsAsReader
This method returns a BufferedReader that wraps the FlowFile's content. This allows a single line of text to be read at a time, rather than buffering the entirety of the FlowFile's content into memory.
However, this method requires the proxy to serialise and deserialise each line, which can cause substantial delays.
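Reading line by line can be sketched as follows. This is an illustration under assumptions rather than a definitive implementation: it assumes the returned reader exposes the Java readLine() method and that the end of the content is reported as None (Java null). StubReader, iter_lines and count_non_empty_lines are hypothetical helpers; the stub only exists so the loop can be run without NiFi.

```python
class StubReader:
    """Hypothetical stand-in for the Java BufferedReader, for local testing only."""

    def __init__(self, text: str):
        self._lines = text.splitlines()
        self._index = 0

    def readLine(self):
        # Mirrors java.io.BufferedReader.readLine: None (Java null) at end of input.
        if self._index >= len(self._lines):
            return None
        line = self._lines[self._index]
        self._index += 1
        return line


def iter_lines(reader):
    """Yield lines from the reader until readLine() signals end of input."""
    while True:
        line = reader.readLine()
        if line is None:
            return
        yield line


def count_non_empty_lines(reader) -> int:
    """Count lines that contain something other than whitespace."""
    return sum(1 for line in iter_lines(reader) if line.strip())


line_count = count_non_empty_lines(StubReader("a,b\n\nc,d\n"))
```

In a processor, the reader would come from flow_file.getContentsAsReader() instead of the stub. Since every readLine() call passes through the proxy, getContentsAsBytes() may be the faster choice when the content comfortably fits in memory.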
getAttribute
FlowFile attributes in Apache NiFi are key-value pairs associated with a FlowFile that provide metadata about the data it contains. These attributes can be used to make routing and processing decisions within a data flow.
Attributes that always exist:
uuid
filename
path
Example of accessing FlowFile attributes:
# Get an attribute by name
filename = flow_file.getAttribute('filename')
# Get a dictionary of all FlowFile attributes
# Note: the getAttributes method returns an instance of the
# py4j.java_collections.JavaMap class, so it is converted to a dict here.
attributes = dict(flow_file.getAttributes())
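Since getAttribute returns None for a missing attribute, a small helper can supply a fallback value. The sketch below is illustrative only: StubFlowFile and attribute_or_default are hypothetical names, the attribute values are made up, and the stub uses a plain dict where NiFi would return a JavaMap.

```python
class StubFlowFile:
    """Hypothetical stand-in for InputFlowFile, for local testing only."""

    def __init__(self, attributes):
        self._attributes = dict(attributes)

    def getAttribute(self, name):
        # Like the real method, return None when the attribute is missing.
        return self._attributes.get(name)

    def getAttributes(self):
        return self._attributes


def attribute_or_default(flow_file, name, default):
    """Return the named attribute, or default if the attribute does not exist."""
    value = flow_file.getAttribute(name)
    return default if value is None else value


# Illustrative attribute values only.
flow_file = StubFlowFile({"uuid": "0000", "filename": "data.json", "path": "./"})
mime_type = attribute_or_default(flow_file, "mime.type", "application/octet-stream")
filename = attribute_or_default(flow_file, "filename", "unknown")
attributes = dict(flow_file.getAttributes())
```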