
PutBigQuery

Description

Writes the contents of a FlowFile to a Google BigQuery table. The processor is record-based, so the schema used is driven by the RecordReader. Attributes that are not matched to the target schema are skipped. Exactly-once delivery semantics are achieved via stream offsets.
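The exactly-once guarantee rests on offsets: each append names the position in the write stream where its rows should land, so a retry of an append that already succeeded is rejected instead of being written twice. The toy model below is a minimal sketch of that idea, together with the skipping of unmatched attributes (read here as record fields absent from the target schema, which is an assumption). `ToyWriteStream`, `project_to_schema`, and the sample field names are hypothetical; none of them belong to NiFi or to the BigQuery client library.

```python
class OffsetMismatch(Exception):
    """Raised when an append does not start at the current end of the stream."""


class ToyWriteStream:
    """In-memory stand-in for a write stream that enforces append offsets."""

    def __init__(self):
        self.rows = []

    def append(self, rows, offset):
        # Accept the batch only if it starts exactly at the current end of the stream.
        if offset != len(self.rows):
            raise OffsetMismatch(f"expected offset {len(self.rows)}, got {offset}")
        self.rows.extend(rows)
        return len(self.rows)  # next offset the caller should use


def project_to_schema(record, schema_fields):
    """Drop fields that the (hypothetical) target schema does not define."""
    return {key: value for key, value in record.items() if key in schema_fields}


schema = {"id", "name"}
batch = [
    project_to_schema(record, schema)
    for record in [
        {"id": 1, "name": "a", "extra": "ignored"},
        {"id": 2, "name": "b"},
    ]
]

stream = ToyWriteStream()
next_offset = stream.append(batch, offset=0)

# Retrying the same batch at the same offset is rejected, so nothing is duplicated.
try:
    stream.append(batch, offset=0)
    duplicated = True
except OffsetMismatch:
    duplicated = False
```

In this model the caller treats an offset mismatch on retry as evidence that the earlier append already landed, and moves on to `next_offset`.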

Tags

bigquery, bq, google, google cloud

Properties

In the list below, required properties are marked with an asterisk (*); all other properties are optional. The table also indicates default values and whether a property supports the NiFi Expression Language.

GCP Credentials Provider Service *
  • API Name: GCP Credentials Provider Service
  • Controller Service: GCPCredentialsService
  • Implementations: GCPCredentialsControllerService
  • Description: The Controller Service used to obtain Google Cloud Platform credentials.

Project ID *
  • API Name: gcp-project-id
  • Description: Google Cloud Project ID
  • Supports Expression Language, using Environment variables.

BigQuery API Endpoint *
  • API Name: bigquery-api-endpoint
  • Default Value: bigquerystorage.googleapis.com:443
  • Description: Can be used to override the default BigQuery endpoint. Default is bigquerystorage.googleapis.com:443. Format must be hostname:port.
  • Supports Expression Language, using Environment variables.

Dataset *
  • API Name: bq.dataset
  • Default Value: ${bq.dataset}
  • Description: BigQuery dataset name (Note - The dataset must exist in GCP)
  • Supports Expression Language, using FlowFile attributes and Environment variables.

Table Name *
  • API Name: bq.table.name
  • Default Value: ${bq.table.name}
  • Description: BigQuery table name
  • Supports Expression Language, using FlowFile attributes and Environment variables.

Record Reader *
  • API Name: bq.record.reader
  • Controller Service: RecordReaderFactory
  • Implementations: AvroReader, CEFReader, CSVReader, ExcelReader, GrokReader, JsonPathReader, JsonTreeReader, ReaderLookup, ScriptedReader, Syslog5424Reader, SyslogReader, WindowsEventLogReader, XMLReader, YamlTreeReader
  • Description: Specifies the Controller Service to use for parsing incoming data.

Transfer Type *
  • API Name: bq.transfer.type
  • Default Value: STREAM
  • Allowable Values: STREAM, BATCH
  • Description: Defines the preferred transfer type streaming or batching

Append Record Count *
  • API Name: bq.append.record.count
  • Default Value: 20
  • Description: The number of records to be appended to the write stream at once. Applicable for both batch and stream types

Number of retries *
  • API Name: gcp-retry-count
  • Default Value: 6
  • Description: How many retry attempts should be made before routing to the failure relationship.

Skip Invalid Rows *
  • API Name: bq.skip.invalid.rows
  • Default Value: false
  • Description: Sets whether to insert all valid rows of a request, even if invalid rows exist. If not set the entire insert request will fail if it contains an invalid row.
  • Supports Expression Language, using FlowFile attributes and Environment variables.

Proxy Configuration Service
  • API Name: proxy-configuration-service
  • Controller Service: ProxyConfigurationService
  • Implementations: StandardProxyConfigurationService
  • Description: Specifies the Proxy Configuration Controller Service to proxy network requests. Supported proxies: HTTP + AuthN
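Three of the property descriptions above translate directly into small pieces of logic: the `hostname:port` endpoint format, the grouping of records by Append Record Count, and the all-or-nothing behavior of Skip Invalid Rows. The following is a minimal sketch of those semantics as described; the function names and the `is_valid` callback are hypothetical, and this is not how the processor itself is implemented.

```python
def parse_endpoint(value):
    """Split a 'hostname:port' string into (hostname, port); raise ValueError if malformed."""
    host, sep, port = value.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"endpoint must be hostname:port, got {value!r}")
    return host, int(port)


def chunk_records(records, append_record_count=20):
    """Yield successive lists of at most append_record_count records."""
    if append_record_count < 1:
        raise ValueError("append_record_count must be positive")
    for start in range(0, len(records), append_record_count):
        yield records[start:start + append_record_count]


def rows_to_insert(rows, is_valid, skip_invalid_rows=False):
    """Return the rows to write, or raise if an invalid row is present and skipping is off."""
    valid = [row for row in rows if is_valid(row)]
    if len(valid) != len(rows) and not skip_invalid_rows:
        raise ValueError("request contains an invalid row; entire insert fails")
    return valid


# Default endpoint and default append count from the property list.
host, port = parse_endpoint("bigquerystorage.googleapis.com:443")
batches = list(chunk_records(list(range(45))))
```

With the default count of 20, the 45 sample records are split into three appends, the last one partial.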

Dynamic Properties

This component does not support dynamic properties.

Relationships

Name: failure
Description: FlowFiles are routed to this relationship if the Google BigQuery operation fails.

Name: success
Description: FlowFiles are routed to this relationship after a successful Google BigQuery operation.
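The interaction between the Number of retries property and these two relationships can be sketched as a retry loop that picks a relationship name at the end. This is an illustration only: `route` and `make_flaky` are hypothetical helpers, and the sketch assumes one initial attempt followed by up to `retry_count` retries, which the property description does not spell out.

```python
def route(operation, retry_count=6):
    """Run operation with retries; return (relationship name, attempts made)."""
    attempts = 0
    # Assumption: one initial attempt plus up to retry_count retries.
    for _ in range(retry_count + 1):
        attempts += 1
        try:
            operation()
            return "success", attempts
        except Exception:
            continue
    return "failure", attempts


def make_flaky(failures_before_success):
    """Build a hypothetical operation that fails a fixed number of times, then succeeds."""
    state = {"remaining": failures_before_success}

    def operation():
        if state["remaining"] > 0:
            state["remaining"] -= 1
            raise RuntimeError("transient BigQuery error (simulated)")

    return operation


recovered = route(make_flaky(2))   # succeeds on the third attempt
exhausted = route(make_flaky(10))  # never succeeds within the retry budget
```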

Reads Attributes

This processor does not read attributes.

Writes Attributes

Name: bq.records.count
Description: Number of records successfully inserted

State Management

This component does not store state.

Restricted

This component is not restricted.

Input Requirement

This component requires an incoming relationship.

System Resource Considerations

This component does not specify system resource considerations.

See Also