For a workload that downloads multiple CSV files, parses them, transforms the data, and tracks whether each file succeeded or failed, two Azure approaches are worth considering:
#1. Storage Queue with Azure Functions:
- Azure Blob Storage can be used to store the CSV files, and a Storage Queue can manage the processing workflow.
- Set up an Azure Function with a queue trigger so that it runs once for each new message in the queue, with each message identifying one CSV file to process.
- Implement the parsing, transformation, and writing logic for each file within the function.
- Track the success or failure of processing by writing the status or any error information to another storage location, such as a separate blob container or a database.
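- Retries rely on the queue's visibility timeout. A message is hidden while it is being processed and is deleted only when the function completes successfully; if the function fails, the message becomes visible again after the timeout and is retried automatically. Once a message has been dequeued more than the configured maximum number of times (maxDequeueCount, 5 by default), the Functions runtime moves it to a poison queue named after the original queue with a "-poison" suffix, which also gives you a list of files that never succeeded.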
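The per-file logic and status tracking described above can be sketched without any Azure SDK calls, which keeps it easy to test locally. This is a minimal sketch under assumptions: `process_csv`, `transform_row`, the shape of the status record, and the example transformation (trimming whitespace and lower-casing column names) are all hypothetical and not part of any Azure API.

```python
import csv
import io
from datetime import datetime, timezone


def transform_row(row: dict) -> dict:
    """Hypothetical transformation: trim whitespace, lower-case column names."""
    return {key.strip().lower(): value.strip() for key, value in row.items()}


def process_csv(blob_name: str, csv_text: str):
    """Parse and transform one CSV file.

    Returns (rows, status). On failure, rows is empty and the status record
    carries the error message so it can be written to a status store.
    """
    status = {
        "blob": blob_name,
        "processed_at": datetime.now(timezone.utc).isoformat(),
    }
    try:
        rows = []
        reader = csv.DictReader(io.StringIO(csv_text))
        for line_no, row in enumerate(reader, start=2):  # line 1 is the header
            # DictReader marks extra fields with a None key and
            # missing fields with a None value.
            if None in row or None in row.values():
                raise ValueError(f"line {line_no}: wrong number of columns")
            rows.append(transform_row(row))
        if not rows:
            raise ValueError("file contains no data rows")
        status.update(state="succeeded", row_count=len(rows))
        return rows, status
    except Exception as exc:  # record any failure instead of losing it
        status.update(state="failed", error=str(exc))
        return [], status
```

In a queue-triggered function, the handler would call something like `process_csv`, write the status record to the chosen blob container or database, and then raise an exception for failures it wants retried, because the runtime only retries a message when the function invocation fails.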
#2. Azure Batch with Spot VMs:
- Azure Batch, a managed service, enables you to run large-scale parallel and batch computing jobs.
- Create an Azure Batch job that defines the tasks for downloading, parsing, transforming, and writing the CSV files.
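- Use Azure Spot VMs, which run on spare Azure capacity at a significantly reduced price, to handle large workloads cost-effectively. The trade-off is that Spot VMs can be evicted whenever Azure needs the capacity back, so each task should be safe to rerun from the start.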
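- Azure Batch tracks the state and exit code of every task as well as the overall job status. You can query the success or failure of each task, let Batch retry failed tasks automatically by setting a maximum retry count in the task constraints, or handle retries programmatically if you need more control.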
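The pattern behind this second approach (run many file tasks in parallel, record each outcome, retry failures up to a limit) can be illustrated locally with Python's `concurrent.futures`. This is a stand-in for the idea, not the Azure Batch API: `run_with_retries` and its parameters are hypothetical names, and in Batch the service itself would schedule the tasks and apply the retry limit.

```python
from concurrent.futures import ThreadPoolExecutor


def run_with_retries(task, items, max_retries=2, workers=4):
    """Run task(item) for every item in parallel, retrying failures.

    items must be unique and hashable (for example, file names).
    Returns a dict mapping each item to its state, attempt count,
    and last error message (None on success).
    """
    results = {
        item: {"state": "pending", "attempts": 0, "error": None}
        for item in items
    }
    pending = list(items)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            # Submit one round of work, then collect every outcome.
            futures = {item: pool.submit(task, item) for item in pending}
            pending = []
            for item, future in futures.items():
                record = results[item]
                record["attempts"] += 1
                error = future.exception()  # blocks until the task finishes
                if error is None:
                    record.update(state="succeeded", error=None)
                elif record["attempts"] <= max_retries:
                    record["error"] = str(error)
                    pending.append(item)  # try again in the next round
                else:
                    record.update(state="failed", error=str(error))
    return results
```

With `max_retries=2`, each file gets at most three attempts. The returned dictionary plays the role of the per-task status that Batch exposes, so a caller can list the files that ended in the `failed` state and decide what to do with them.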
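The choice between these approaches depends on the complexity of the processing logic, the scale of the workload, and the specific requirements of your use case. The queue-and-function approach suits many small, independent files with little infrastructure to manage, while Batch is the better fit for large-scale or long-running parallel processing.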