FetchAzureDataLakeStorage

Description

Fetch the specified file from Azure Data Lake Storage

Tags

adlsgen2, azure, cloud, datalake, microsoft, storage

Properties

In the list below, required properties are marked with an asterisk (*); all other properties are optional. The list also gives any default values and states whether a property supports the NiFi Expression Language.

ADLS Credentials *
    API Name: adls-credentials-service
    Allowable Values: Controller Service: ADLSCredentialsService
    Implementations: ADLSCredentialsControllerServiceLookup, ADLSCredentialsControllerService
    Description: Controller Service used to obtain Azure Credentials.

Filesystem Name *
    API Name: filesystem-name
    Description: Name of the Azure Storage File System (also called Container). It is assumed to be already existing.
    Supports Expression Language, using FlowFile attributes and Environment variables.

Directory Name *
    API Name: directory-name
    Description: Name of the Azure Storage Directory. The Directory Name cannot contain a leading '/'. The root directory can be designated by the empty string value. In case of the PutAzureDataLakeStorage processor, the directory will be created if not already existing.
    Supports Expression Language, using FlowFile attributes and Environment variables.

File Name *
    API Name: file-name
    Default Value: ${azure.filename}
    Description: The filename
    Supports Expression Language, using FlowFile attributes and Environment variables.

Range Start
    API Name: range-start
    Description: The byte position at which to start reading from the object. An empty value or a value of zero will start reading at the beginning of the object.
    Supports Expression Language, using FlowFile attributes and Environment variables.

Range Length
    API Name: range-length
    Description: The number of bytes to download from the object, starting from the Range Start. An empty value or a value that extends beyond the end of the object will read to the end of the object.
    Supports Expression Language, using FlowFile attributes and Environment variables.

Number of Retries
    API Name: number-of-retries
    Default Value: 0
    Description: The number of automatic retries to perform if the download fails.
    Supports Expression Language, using FlowFile attributes and Environment variables.

Proxy Configuration Service
    API Name: proxy-configuration-service
    Allowable Values: Controller Service: ProxyConfigurationService
    Implementations: StandardProxyConfigurationService
    Description: Specifies the Proxy Configuration Controller Service to proxy network requests. Supported proxies: SOCKS, HTTP. In case of SOCKS, it is not guaranteed that the selected SOCKS Version will be used by the processor.
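
The interaction between Range Start and Range Length can be illustrated with a short Python sketch. It is not the processor's implementation. The function name resolve_range and its parameters are hypothetical, values are treated as plain byte counts, and the behaviour for a Range Start beyond the end of the object is an assumption, since the property descriptions do not cover that case.

```python
from typing import Optional, Tuple


def resolve_range(object_size: int,
                  range_start: Optional[int] = None,
                  range_length: Optional[int] = None) -> Tuple[int, int]:
    """Return (offset, count) for the bytes that would be read.

    Hypothetical helper mirroring the documented rules:
    - an empty (None) or zero Range Start reads from the beginning;
    - an empty Range Length, or one that runs past the end of the
      object, reads to the end of the object.
    """
    start = range_start or 0
    if start < 0:
        raise ValueError("range_start must not be negative")
    if range_length is not None and range_length < 0:
        raise ValueError("range_length must not be negative")

    # Assumption: a start past the end of the object yields zero bytes.
    remaining = max(object_size - start, 0)

    if range_length is None or range_length > remaining:
        count = remaining
    else:
        count = range_length
    return start, count


# For a 100-byte object:
whole_object = resolve_range(100)          # both properties left empty
middle_slice = resolve_range(100, 10, 20)  # 20 bytes starting at byte 10
clamped_tail = resolve_range(100, 90, 50)  # length runs past the end
```

Leaving both properties empty is therefore equivalent to fetching the whole file, which is the common case when the processor follows a listing processor.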

Dynamic Properties

This component does not support dynamic properties.

Relationships

Name: Description
failure: Files that could not be fetched from Azure storage for some reason are transferred to this relationship
success: Files that have been successfully fetched from Azure storage are transferred to this relationship

Reads Attributes

This processor does not read attributes.

Writes Attributes

Name: Description
azure.datalake.storage.errorCode: The Azure Data Lake Storage moniker of the failed operation
azure.datalake.storage.errorMessage: The Azure Data Lake Storage error message from the failed operation
azure.datalake.storage.statusCode: The HTTP error code (if available) from the failed operation

State Management

This component does not store state.

Restricted

This component is not restricted.

Input Requirement

This component requires an incoming relationship.

Example Use Cases Involving Other Components

Multiprocessor Use Case 1

Retrieve all files in an Azure Data Lake Storage directory

Components Involved

  • ListAzureDataLakeStorage
    1. The "Filesystem Name" property should be set to the name of the Azure Filesystem (also known as a Container) that files reside in. If the flow being built is to be reused elsewhere, it's a good idea to parameterize this property by setting it to something like #{AZURE_FILESYSTEM}.
    2. Configure the "Directory Name" property to specify the name of the directory in the file system. If the flow being built is to be reused elsewhere, it's a good idea to parameterize this property by setting it to something like #{AZURE_DIRECTORY}.
    3. The "ADLS Credentials" property should specify an instance of the ADLSCredentialsService in order to provide credentials for accessing the filesystem.
    4. The 'success' Relationship of this Processor is then connected to FetchAzureDataLakeStorage.
  • FetchAzureDataLakeStorage
    1. "Filesystem Name" = "${azure.filesystem}"
    2. "Directory Name" = "${azure.directory}"
    3. "File Name" = "${azure.filename}"
    4. The "ADLS Credentials" property should specify an instance of the ADLSCredentialsService in order to provide credentials for accessing the filesystem.
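
The FetchAzureDataLakeStorage settings above rely on each incoming FlowFile carrying the azure.filesystem, azure.directory and azure.filename attributes. The Python sketch below shows roughly how those property values resolve for one FlowFile. It is a deliberately simplified stand-in for the NiFi Expression Language that handles only plain ${name} references; the resolve function and the sample attribute values are hypothetical.

```python
import re
from typing import Dict

# Matches plain ${attribute.name} references only; real Expression
# Language also supports functions, which this sketch ignores.
_REFERENCE = re.compile(r"\$\{([^}]+)\}")


def resolve(value: str, attributes: Dict[str, str]) -> str:
    """Replace each ${name} with the FlowFile attribute of that name.

    Simplification: a missing attribute resolves to an empty string.
    """
    return _REFERENCE.sub(lambda m: attributes.get(m.group(1), ""), value)


# Property values from the use case above.
fetch_properties = {
    "Filesystem Name": "${azure.filesystem}",
    "Directory Name": "${azure.directory}",
    "File Name": "${azure.filename}",
}

# Hypothetical attributes on one FlowFile arriving from the listing step.
flowfile_attributes = {
    "azure.filesystem": "my-filesystem",
    "azure.directory": "incoming/2024",
    "azure.filename": "report.csv",
}

resolved = {
    name: resolve(value, flowfile_attributes)
    for name, value in fetch_properties.items()
}

for name, value in resolved.items():
    print(f"{name} = {value}")
```

Because the values are resolved per FlowFile, a single FetchAzureDataLakeStorage processor can retrieve every file that the listing step emits.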

System Resource Considerations

This component does not specify system resource considerations.

See Also

DeleteAzureDataLakeStorage, ListAzureDataLakeStorage, PutAzureDataLakeStorage