FetchAzureDataLakeStorage
Description
Fetches the specified file from Azure Data Lake Storage.
Tags
adlsgen2, azure, cloud, datalake, microsoft, storage
Properties
In the table below, required properties are marked with an asterisk (*). All other properties are optional. The table also lists any default values and indicates whether a property supports the NiFi Expression Language.
Display Name | API Name | Default Value | Allowable Values | Description |
---|---|---|---|---|
ADLS Credentials * | adls-credentials-service | | Controller Service: ADLSCredentialsService. Implementations: ADLSCredentialsControllerServiceLookup, ADLSCredentialsControllerService | Controller Service used to obtain Azure Credentials. |
Filesystem Name * | filesystem-name | | | Name of the Azure Storage File System (also called Container). It is assumed to already exist. Supports Expression Language, using FlowFile attributes and Environment variables. |
Directory Name * | directory-name | | | Name of the Azure Storage Directory. The Directory Name cannot contain a leading '/'. The root directory can be designated by the empty string value. In case of the PutAzureDataLakeStorage processor, the directory will be created if it does not already exist. Supports Expression Language, using FlowFile attributes and Environment variables. |
File Name * | file-name | ${azure.filename} | | The filename. Supports Expression Language, using FlowFile attributes and Environment variables. |
Range Start | range-start | | | The byte position at which to start reading from the object. An empty value or a value of zero will start reading at the beginning of the object. Supports Expression Language, using FlowFile attributes and Environment variables. |
Range Length | range-length | | | The number of bytes to download from the object, starting from the Range Start. An empty value or a value that extends beyond the end of the object will read to the end of the object. Supports Expression Language, using FlowFile attributes and Environment variables. |
Number of Retries | number-of-retries | 0 | | The number of automatic retries to perform if the download fails. Supports Expression Language, using FlowFile attributes and Environment variables. |
Proxy Configuration Service | proxy-configuration-service | | Controller Service: ProxyConfigurationService. Implementations: StandardProxyConfigurationService | Specifies the Proxy Configuration Controller Service to proxy network requests. Supported proxies: SOCKS, HTTP. In case of SOCKS, it is not guaranteed that the selected SOCKS Version will be used by the processor. |
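
The Range Start and Range Length descriptions above amount to simple byte-range arithmetic. The following Python sketch illustrates one reading of those rules; it is not the processor's implementation. The function name `resolve_range`, the use of plain integer byte counts, and the handling of a start position past the end of the object are assumptions made for illustration.

```python
from typing import Optional, Tuple


def resolve_range(object_size: int,
                  range_start: Optional[int] = None,
                  range_length: Optional[int] = None) -> Tuple[int, int]:
    """Return (offset, byte_count) for a read of an object of object_size bytes.

    None stands in for an empty property value.
    """
    # An empty value or zero starts reading at the beginning of the object.
    offset = range_start or 0
    # Assumption: a start past the end of the object yields an empty read.
    offset = min(offset, object_size)
    remaining = object_size - offset
    # An empty length, or one that extends beyond the end, reads to the end.
    if range_length is None or range_length > remaining:
        count = remaining
    else:
        count = range_length
    return offset, count


print(resolve_range(1000))            # (0, 1000)
print(resolve_range(1000, 200, 300))  # (200, 300)
print(resolve_range(1000, 900, 500))  # (900, 100)
```

Negative values and a Range Length of zero are not covered by the property descriptions, so the sketch does not assign them any special meaning.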
Dynamic Properties
This component does not support dynamic properties.
Relationships
Name | Description |
---|---|
failure | Files that could not be fetched from Azure storage for some reason are transferred to this relationship |
success | Files that have been successfully fetched from Azure storage are transferred to this relationship |
Reads Attributes
This processor does not read attributes.
Writes Attributes
Name | Description |
---|---|
azure.datalake.storage.errorCode | The Azure Data Lake Storage moniker of the failed operation |
azure.datalake.storage.errorMessage | The Azure Data Lake Storage error message from the failed operation |
azure.datalake.storage.statusCode | The HTTP error code (if available) from the failed operation |
State Management
This component does not store state.
Restricted
This component is not restricted.
Input Requirement
This component requires an incoming relationship.
Example Use Cases Involving Other Components
Multiprocessor Use Case 1
Retrieve all files in an Azure Data Lake Storage directory
Components Involved
- ListAzureDataLakeStorage
  - The "Filesystem Name" property should be set to the name of the Azure Filesystem (also known as a Container) that files reside in. If the flow being built is to be reused elsewhere, it's a good idea to parameterize this property by setting it to something like #{AZURE_FILESYSTEM}.
  - Configure the "Directory Name" property to specify the name of the directory in the file system. If the flow being built is to be reused elsewhere, it's a good idea to parameterize this property by setting it to something like #{AZURE_DIRECTORY}.
  - The "ADLS Credentials" property should specify an instance of the ADLSCredentialsService in order to provide credentials for accessing the filesystem.
  - The 'success' Relationship of this Processor is then connected to FetchAzureDataLakeStorage.
- FetchAzureDataLakeStorage
  - "Filesystem Name" = "${azure.filesystem}"
  - "Directory Name" = "${azure.directory}"
  - "File Name" = "${azure.filename}"
  - The "ADLS Credentials" property should specify an instance of the ADLSCredentialsService in order to provide credentials for accessing the filesystem.
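
To make the hand-off between the two processors concrete, the sketch below simulates in Python how the FetchAzureDataLakeStorage property values in this use case could resolve against the attributes of a single listed FlowFile. It is a simplified stand-in for the NiFi Expression Language that handles only plain `${attribute}` references. The attribute values, the `evaluate` helper, and the choice to resolve a missing attribute to an empty string are assumptions made for illustration.

```python
import re

# Hypothetical attributes on one FlowFile emitted by ListAzureDataLakeStorage.
flowfile_attributes = {
    "azure.filesystem": "my-container",
    "azure.directory": "incoming/2024",
    "azure.filename": "report.csv",
}

# Property values configured on FetchAzureDataLakeStorage in this use case.
fetch_properties = {
    "Filesystem Name": "${azure.filesystem}",
    "Directory Name": "${azure.directory}",
    "File Name": "${azure.filename}",
}

# Matches a plain attribute reference such as ${azure.filename}.
_REFERENCE = re.compile(r"\$\{([^}]+)\}")


def evaluate(value: str, attributes: dict) -> str:
    """Replace each ${name} reference with the matching attribute value."""
    # Assumption for this sketch: a missing attribute resolves to "".
    return _REFERENCE.sub(lambda m: attributes.get(m.group(1), ""), value)


resolved = {name: evaluate(value, flowfile_attributes)
            for name, value in fetch_properties.items()}

for name, value in resolved.items():
    print(f"{name} = {value}")
```

Because each listed FlowFile carries its own attribute values, the same FetchAzureDataLakeStorage configuration fetches a different file for every FlowFile it receives.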
System Resource Considerations
This component does not specify system resource considerations.
See Also
DeleteAzureDataLakeStorage, ListAzureDataLakeStorage, PutAzureDataLakeStorage