ExtractDocumentRawText
Description
Extracts the text from a Document and writes it to the FlowFile content. The output does not include text found in Processing Elements.
Tags
datavolo, document, rag, retrieval augmented generation, text, unstructured
Properties
In the list below, required properties are marked with an asterisk (*); all other properties are optional. The table also indicates default values and whether a property supports the NiFi Expression Language.
Display Name | API Name | Default Value | Allowable Values | Description |
---|---|---|---|---|
Dynamic Properties
This component does not support dynamic properties.
Relationships
Name | Description |
---|---|
failure | If the text of a FlowFile cannot be extracted for any reason, the input FlowFile will be routed to this relationship. |
success | The text extracted from the Document is routed to this relationship. |
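
The routing described in the table above can be sketched roughly as follows. This is only an illustration of the success/failure semantics, not the processor's implementation: the function name `route_document`, the JSON layout, and the `elements`, `text`, and `processingElements` keys are all hypothetical stand-ins, since this page does not describe the Document format.

```python
import json
from typing import Tuple


def route_document(content: bytes) -> Tuple[str, bytes]:
    """Return (relationship, flowfile_content) for one incoming FlowFile.

    Assumes a hypothetical JSON Document of the form
    {"elements": [{"text": "..."}], "processingElements": [...]}.
    """
    try:
        doc = json.loads(content)
        # Collect text from the (hypothetical) "elements" list only.
        # The (hypothetical) "processingElements" key is never read,
        # mirroring the note that such text is excluded from the output.
        parts = [element["text"] for element in doc["elements"]]
        return "success", "\n".join(parts).encode("utf-8")
    except (ValueError, KeyError, TypeError):
        # Extraction failed for some reason: route the input to failure.
        # Returning the content as-is is an assumption of this sketch.
        return "failure", content
```

With this sketch, a well-formed input yields the joined text on `success`, while anything that cannot be parsed or lacks the expected keys is routed to `failure`.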
Reads Attributes
This processor does not read attributes.
Writes Attributes
This processor does not write attributes.
State Management
This component does not store state.
Restricted
This component is not restricted.
Input Requirement
This component requires an incoming relationship.
System Resource Considerations
This component does not specify system resource considerations.