CountText

Description

Counts various metrics on incoming text. The requested results will be recorded as attributes. The resulting flowfile will not have its content modified.

Tags

character, count, line, text, word

Properties

In the list below required Properties are shown with an asterisk (*). Other properties are considered optional. The table also indicates any default values, and whether a property supports the NiFi Expression Language.

Count Lines *
  API Name: text-line-count
  Default Value: true
  Allowable Values:
  • true
  • false
  Description: If enabled, counts the number of lines present in the incoming text.

Count Non-Empty Lines *
  API Name: text-line-nonempty-count
  Default Value: false
  Allowable Values:
  • true
  • false
  Description: If enabled, counts the number of lines in the incoming text that contain at least one non-whitespace character.

Count Words *
  API Name: text-word-count
  Default Value: false
  Allowable Values:
  • true
  • false
  Description: If enabled, counts the number of words (alphanumeric character groups bounded by whitespace) present in the incoming text. Common logical delimiters [ _ - . ] do not bound a word unless 'Split Words on Symbols' is true.

Count Characters *
  API Name: text-character-count
  Default Value: false
  Allowable Values:
  • true
  • false
  Description: If enabled, counts the number of characters (including whitespace and symbols, but not including newlines and carriage returns) present in the incoming text.

Split Words on Symbols *
  API Name: split-words-on-symbols
  Default Value: false
  Allowable Values:
  • true
  • false
  Description: If enabled, the word count treats strings separated by common logical delimiters [ _ - . ] as independent words (e.g. 'split-words-on-symbols' = 4 words).

Character Encoding *
  API Name: character-encoding
  Default Value: UTF-8
  Allowable Values:
  • ISO-8859-1
  • UTF-8
  • UTF-16
  • UTF-16LE
  • UTF-16BE
  • US-ASCII
  Description: Specifies the character encoding used to read the incoming text.

Call Immediate Adjustment *
  API Name: ajust-immediately
  Default Value: false
  Allowable Values:
  • true
  • false
  Description: If true, the counter will be updated immediately, without regard to whether the ProcessSession is committed or rolled back; otherwise, the counter will be incremented only if and when the ProcessSession is committed.
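
The counting rules described in the properties above can be illustrated with a small Python sketch. This is not the processor's implementation; it only mirrors the documented descriptions. The function name `count_text` and its keyword arguments are hypothetical. Two details are assumptions rather than documented behavior: a trailing line terminator does not start an extra empty line, and every non-empty token counts as a word.

```python
import re

# Line terminators recognised by this sketch. Treating a trailing terminator
# as "no extra empty line" is an assumption, not documented behavior.
_LINE_BREAK = re.compile(r"\r\n|\r|\n")
# Word boundaries: whitespace only, or whitespace plus the delimiters _ - .
_SPLIT_WHITESPACE = re.compile(r"\s+")
_SPLIT_SYMBOLS = re.compile(r"[\s_\-.]+")


def count_text(text, count_lines=True, count_nonempty_lines=False,
               count_words=False, count_characters=False,
               split_words_on_symbols=False):
    """Return only the requested counts, keyed by the attribute names
    listed under Writes Attributes. The input text is never modified."""
    lines = _LINE_BREAK.split(text)
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after a trailing terminator

    splitter = _SPLIT_SYMBOLS if split_words_on_symbols else _SPLIT_WHITESPACE
    attributes = {}
    if count_lines:
        attributes["text.line.count"] = len(lines)
    if count_nonempty_lines:
        # A line is non-empty if it has at least one non-whitespace character.
        attributes["text.line.nonempty.count"] = sum(
            1 for line in lines if line.strip())
    if count_words:
        # Assumption: any non-empty token between boundaries is a word.
        attributes["text.word.count"] = sum(
            1 for line in lines for token in splitter.split(line) if token)
    if count_characters:
        # Newlines and carriage returns were removed by the line split.
        attributes["text.character.count"] = sum(len(line) for line in lines)
    return attributes


if __name__ == "__main__":
    sample = "split-words-on-symbols\n\n  \nhello world\r\n"
    print(count_text(sample, count_nonempty_lines=True,
                     count_words=True, count_characters=True))
```

With `split_words_on_symbols=True`, the string `split-words-on-symbols` yields four words in this sketch, matching the example given in the property description; with the default `False` it yields one.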

Dynamic Properties

This component does not support dynamic properties.

Relationships

Name: Description
failure: If the flowfile text cannot be counted for some reason, the original file will be routed to this destination and nothing will be routed elsewhere.
success: The flowfile contains the original content with one or more attributes added containing the respective counts.

Reads Attributes

This processor does not read attributes.

Writes Attributes

Name: Description
text.character.count: The number of characters (given the specified character encoding) present in the original FlowFile.
text.line.count: The number of lines of text present in the FlowFile content.
text.line.nonempty.count: The number of lines of text (with at least one non-whitespace character) present in the original FlowFile.
text.word.count: The number of words present in the original FlowFile.
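
Because text.character.count depends on the specified character encoding, the same bytes can produce different counts under different settings. The Python sketch below illustrates that effect only. The helper name `character_count` is hypothetical, and Python's decoder stands in for whatever decoding the processor performs.

```python
def character_count(raw: bytes, encoding: str) -> int:
    """Decode the bytes with the given encoding and count characters,
    leaving out newlines and carriage returns."""
    text = raw.decode(encoding)
    return sum(1 for ch in text if ch not in "\r\n")


# "héllo wörld" plus a newline, written with escapes so that the source
# file's own encoding does not matter. Each accented letter takes two
# bytes in UTF-8.
payload = "h\u00e9llo w\u00f6rld\n".encode("utf-8")

utf8_count = character_count(payload, "utf-8")         # bytes read as UTF-8
latin1_count = character_count(payload, "iso-8859-1")  # same bytes read as ISO-8859-1

if __name__ == "__main__":
    print(utf8_count, latin1_count)
```

Read as ISO-8859-1, each two-byte UTF-8 sequence becomes two separate characters, so the count is higher than when the same content is read as UTF-8. Setting Character Encoding to match the actual encoding of the incoming content avoids this kind of mismatch.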

State Management

This component does not store state.

Restricted

This component is not restricted.

Input Requirement

This component requires an incoming relationship.

System Resource Considerations

This component does not specify system resource considerations.

See Also

SplitText