All posts by Kevin S. Ridolfi

How BASF’s Agriculture Solutions drives traceability and climate action by tokenizing cotton value chains using Amazon Managed Blockchain

Post Syndicated from Kevin S. Ridolfi original https://aws.amazon.com/blogs/architecture/how-basfs-agriculture-solutions-drives-traceability-and-climate-action-by-tokenizing-cotton-value-chains-using-amazon-managed-blockchain/

BASF Agricultural Solutions combines innovative products and digital tools with practical farmer knowledge. With over a century of experience, BASF offers a broad portfolio spanning seeds, crop protection, soil management, plant health, and digital agriculture solutions. Through collaboration with farmers, scientists, and partners, BASF strives to meet societal needs sustainably while creating a lasting agricultural legacy. Infosys is a global premier consulting and managed services partner of Amazon Web Services (AWS). Through this unique partnership, AWS helps customers integrate software, services, and processes to accelerate business transformation. This post explores the commitment of this partnership to driving positive change in the agricultural industry by using Amazon Managed Blockchain to tokenize food and cotton value chains for traceability, climate action, and circularity.

Global challenges and the agricultural industry

The world’s population is growing, with the UN projecting 8.5 billion people in 2030 and 10.4 billion by the end of the century. This growth, together with a global rise in standards of living, drives rising demand for agricultural products such as fiber and food crops. At the same time, as society becomes more aware of the ecological impact of agriculture, both local communities and farmers are placing greater focus on the sustainable management of natural resources. The agricultural industry is uniquely positioned at the intersection of these two trends.

The agricultural industry faces numerous complex challenges that span both business and technical domains. From a business perspective, today’s agricultural supply chains have become incredibly complex, often involving multiple intermediaries across different countries. This complexity makes it difficult to ensure fair pricing and adequate compensation for farmers, who are often at the bottom of the value chain. Furthermore, verifying sustainable farming practices and organic certifications has become increasingly challenging, even as consumer demand for product authenticity and sustainability information continues to grow. Adding to these pressures, agricultural businesses must navigate increasing regulatory requirements for environmental regulation compliance and reporting, along with complex international trade regulations and documentation.

On the technical front, the industry struggles with limited digital infrastructure in rural farming areas, where internet connectivity and technology adoption remain significant hurdles. Data collection methods vary widely across different farms and regions, making it difficult to establish consistent metrics and reporting standards. Many agricultural businesses still operate with legacy systems that resist integration with modern tracking solutions, and the lack of standardization in agricultural data formats creates additional complications. Maintaining data integrity across multiple stakeholders has proven particularly challenging, as has the implementation of real-time tracking and tracing capabilities.

Cotton and fast fashion: Industry’s challenges

Cotton is the world’s most important natural fiber crop, with a yearly production of 126.5 million bales in the 2022–2023 season, enough to produce 25.3 billion pairs of jeans or 151.8 billion T-shirts. It also plays a major role in the fast fashion industry, where garments and clothing undergo a fast production and disposal cycle to quickly address customer attention and the latest fashion trends, with around 30% of clothing sold in the US being made with cotton. This accelerated production and disposal cycle comes at the expense of considerable environmental impact, with the fast fashion industry accounting for approximately 20% of the world’s water consumption and 10% of the world’s total CO2 emissions.
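As a quick plausibility check, the per-bale yields implied by these totals follow from simple division. The sketch below uses only the figures quoted above; the constant and function names are illustrative.

```python
# Figures quoted in the paragraph above (2022-2023 season).
BALES = 126.5e6       # bales of cotton produced
JEANS = 25.3e9        # pairs of jeans that production could yield
T_SHIRTS = 151.8e9    # T-shirts that production could yield


def per_bale(total_items: float, bales: float = BALES) -> float:
    """Items obtainable from a single bale, as implied by the totals."""
    return total_items / bales


jeans_per_bale = per_bale(JEANS)
tshirts_per_bale = per_bale(T_SHIRTS)
print(round(jeans_per_bale), round(tshirts_per_bale))  # prints: 200 1200
```

In other words, the quoted totals assume roughly 200 pairs of jeans or 1,200 T-shirts per bale.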

The cotton industry faces its own set of distinct challenges. Water usage stands as one of the most pressing concerns, with a single cotton T-shirt requiring approximately 2,700 liters of water to produce. Chemical usage tracking presents another significant challenge, as stakeholders must carefully monitor pesticide and fertilizer application throughout the growing process. Labor practices verification has become increasingly important, with brands and consumers demanding assurance of ethical working conditions throughout the supply chain.

Quality verification poses another crucial challenge, given that maintaining accurate documentation of cotton grade and characteristics is essential for pricing and processing. The industry’s global nature creates additional complexities in cross-border logistics, requiring careful management of international shipping and customs processes. Furthermore, the growing importance of sustainability certification has created new pressures to validate organic and sustainable farming practices with reliable, transparent documentation.

As consumer expectation of guaranteed fair practices, lower carbon emissions, and sustainable use of natural resources grows, so does the demand for traceability systems that can provide near real-time visibility into each step of the value chain by tracking sustainability information such as water consumption and CO2 emissions.

The potential for a blockchain-based solution

To address this demand, BASF identified blockchain as a foundational technology for a digital solution to deliver transparency along the value chain, targeting specific customer requirements for digital assets backed by information, validation, certificates, and know your business (KYB) policies for value chain partners.

Blockchain technology emerges as a particularly powerful solution to these challenges, offering unique capabilities that directly address many of the industry’s pain points. At its core, blockchain provides immutable record-keeping, creating permanent, tamper-proof records of transactions and events that ensure data integrity throughout the supply chain. This feature proves especially valuable in preventing fraudulent modification of sustainability certificates and maintaining the credibility of organic farming claims.

Smart contracts, a key feature of blockchain technology, enable the automation of compliance with agricultural standards and facilitate automatic payment execution based on predefined conditions. This automation significantly reduces administrative overhead in supply chain management and helps ensure fair compensation for farmers.
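To make the idea of condition-based payment concrete, here is a minimal sketch in plain Python. It is not chain code and not BASF's implementation; the class names, fields, and the bonus rule are assumptions chosen only to show how a payout can depend solely on facts already recorded and validated.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Delivery:
    farmer: str
    bales: int
    certified_sustainable: bool  # assumed to be set by a validator, not the farmer
    paid: bool = False


@dataclass
class PaymentContract:
    price_per_bale: float
    sustainability_bonus: float  # hypothetical extra payment per certified bale
    balances: Dict[str, float] = field(default_factory=dict)

    def settle(self, delivery: Delivery) -> float:
        """Pay the farmer exactly once, based only on the recorded delivery."""
        if delivery.paid:
            raise ValueError("delivery already settled")
        rate = self.price_per_bale
        if delivery.certified_sustainable:
            rate += self.sustainability_bonus
        amount = rate * delivery.bales
        self.balances[delivery.farmer] = self.balances.get(delivery.farmer, 0.0) + amount
        delivery.paid = True
        return amount
```

Because the rule is deterministic, every participant who replays the same recorded delivery arrives at the same payout, which is what removes the need for manual reconciliation.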

The technology’s traceability capabilities provide end-to-end visibility of cotton from seed to garment, enabling real-time tracking of sustainability metrics and creating transparent audit trails for certification purposes. This transparency helps brands and consumers verify the authenticity and sustainability of their cotton products while enabling farmers to demonstrate their commitment to sustainable practices.

Blockchain’s decentralized data management allows multiple stakeholders to maintain shared records without requiring a central authority, eliminating single points of failure in data storage and reducing dependency on central authorities. This decentralized approach proves particularly valuable in agricultural supply chains, where numerous parties need to access and verify information.

The implementation of token economics through blockchain creates new opportunities for incentivizing sustainable farming practices. Through tokenization, farmers can access new revenue streams, including carbon credits, while establishing more direct relationships with buyers. Additionally, blockchain’s digital identity capabilities provide secure authentication for supply chain participants, enabling granular access control to sensitive data and facilitating compliance with know your customer (KYC) and KYB requirements.

Solution overview

Using a permissioned blockchain based on open-source systems, BASF Agricultural Solutions has developed a novel way to promote data democratization and address the challenges of data recording, off-chain processes, and on-chain activities at scale. The solution enables value chain players to independently and progressively verify activities, while an on-chain and off-chain organizational structure monitors key performance indicators (KPIs) through a decentralized autonomous organization (DAO) interface.

To focus on building such a system rather than managing the underlying blockchain infrastructure, BASF selected Amazon Managed Blockchain alongside additional AWS services. Amazon Managed Blockchain simplifies BASF’s approach because it brings a suite of offerings that can be configured to build this solution without the need to add more layers and external or internal sources.

As a foundational system, Amazon Managed Blockchain augments the solution’s ability to generate smart certificates along with off-chain opportunities to further expand the offering as a platform, such as with AI and AWS Lambda. This fits into BASF’s vision to deliver best-in-class solutions for the farming community and deliver trusted information to communities that want to drive a positive impact for the planet.

The following are the key structural components of the solution:

  • Peers – These blockchain nodes run smart contracts (chain code) and maintain the ledger.
  • Ordering service – The ordering service makes sure a transaction meets the consensus requirements based on configured channel and endorsement policies for the installed chain code.
  • Fabric certificate authority (CA) – This component enrolls and generates blockchain identities needed to sign transactions.
  • AWS services – The solution uses various AWS services to perform operations on the blockchain efficiently. These services include:
    • Amazon Cognito – We use Amazon Cognito to onboard external users and clients to the platform.
    • AWS Fargate – A block listener is a custom service that listens to every block event from the blockchain and updates the off-chain storage accordingly. It’s hosted as a container on Fargate. Running a container using Fargate is more straightforward than other Kubernetes services because you don’t have to manage servers or clusters of Amazon Elastic Compute Cloud (Amazon EC2) instances. With Fargate, we no longer have to provision, configure, or scale clusters of virtual machines to run containers.
    • AWS Lambda – Middleware services are hosted as Lambda functions, which makes sure the services are automatically scalable by default and cost-efficient. This is important because we’re charged based on the number of requests for the function and the time it takes for the code to run.
    • Amazon OpenSearch Service – We use OpenSearch Service as an off-chain data store because the solution requires complex queries to aggregate the ledger data. The off-chain storage is kept in sync with the ledger and is restricted for direct updates. It can be updated only by an authorized application, based on ledger events.
    • AWS Secrets Manager – We use Secrets Manager to manage blockchain identities.
    • Amazon Simple Notification Service (Amazon SNS) – We use Amazon SNS to connect various services asynchronously.
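The block listener described in the list above can be sketched as follows. This is an in-memory illustration only: the `OffChainStore` class, the event shape (`asset_id`, `state`), and the strict block-ordering rule are assumptions, and a real listener would write to the off-chain data store rather than a dictionary.

```python
from typing import Dict, List


class OffChainStore:
    """Stand-in for the off-chain index; keeps the latest state per asset."""

    def __init__(self) -> None:
        self.documents: Dict[str, dict] = {}
        self.last_block = -1  # highest block number already applied

    def apply_block(self, block_number: int, events: List[dict]) -> bool:
        """Apply one block's events; return False if the block was already seen."""
        if block_number <= self.last_block:
            return False  # replayed block: ignore so the update stays idempotent
        if block_number != self.last_block + 1:
            raise ValueError(f"gap detected: expected block {self.last_block + 1}")
        for event in events:
            self.documents[event["asset_id"]] = event["state"]
        self.last_block = block_number
        return True
```

Tracking the last applied block is one simple way to keep the off-chain copy consistent with the ledger when block events are delivered more than once or arrive out of order.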

The following diagram illustrates the solution architecture.

The solution architecture is extensible and scalable to meet the dynamic load requirements. It can seamlessly connect various data sources with appropriate connectors such as Salesforce, mobility platforms, third-party services, and more.

External users such as value chain players, retailers, and others who could benefit from tokens can access the platform through different methods. Generally, DAOs are accessed through a business-to-customer (B2C) login, and end retailers can subscribe to API streams for checkouts, point of sale (POS), and so on. Additionally, we provide internal access for admins and auditors to visualize the product flows.

Conclusion

Climate challenges are quite complex and require a joint approach between technology, the custodians of our planet (namely the farmers), and public chains that deliver the right protocols. BASF Agriculture Solutions represents the farming needs and the link to the right communities and crops on the ground, AWS brings in the right infrastructure and support of the cloud and scale, and Infosys brings in development support as a partner to both AWS and BASF.

BASF is connected to millions of farmers. BASF considers farming to be the biggest job on earth. Sustainable farming means bringing back lost biodiversity and increasing carbon capture within the soil. And sustainability overall requires additional effort by the farmers. Additionally, consumers like us make choices daily when it comes to our own purchase decisions, such as to buy sustainable products or take action that brings positive impact to the climate.

The solution outlined in this post uses blockchain as the base technology to enable a secure and reliable method for information sharing across all stakeholders. It’s the baseline to onboard use cases in the agriculture industry to enable end-to-end traceability with a 360-degree view. Smart contracts incentivize farmers and other stakeholders to follow sustainable measures based on the information in the system, which is reviewed and authorized by validators. All actions in the system are monitored and logged as immutable records, which establishes trust in the information by default. This acts as a baseline for 100% traceability, tokenization for sustainable measures, and digital assets that can be exchanged and create a positive economy around sustainability. The design discussed in this post is flexible enough to onboard different use cases and can scale automatically to meet dynamic data volumes.

We encourage you to join BASF, Infosys, and AWS in driving sustainability through trusted value chains that incentivize farmers, empower consumers, and create a positive economy around climate action. If you want to dive deep into topics surrounding sustainability and AWS architecture, we suggest visiting the AWS Architecture Blog.


BASF Digital Farming builds a STAC-based solution on Amazon EKS

Post Syndicated from Kevin S. Ridolfi original https://aws.amazon.com/blogs/architecture/basf-digital-farming-builds-a-stac-based-solution-on-amazon-eks/

This post was co-written with Frederic Haase and Julian Blau with BASF Digital Farming GmbH.

At xarvio – BASF Digital Farming, our mission is to empower farmers around the world with cutting-edge digital agronomic decision-making tools. Central to this mission is our crop optimization platform, xarvio FIELD MANAGER, which delivers actionable insights through a range of geospatial assets, including satellite imagery, drone data, and application maps from sprayers.

In this post, we show you how we built a scalable geospatial data solution on AWS to efficiently catalog, manage, and visualize both raster and vector datasets through the web. We walk you through our solution based on the SpatioTemporal Asset Catalog (STAC) specification and the open source eoAPI ecosystem, detailing the solution architecture, key technologies, and lessons learned during deployment. This builds upon a previous post on efficient satellite imagery ingestion using AWS Serverless, extending our discussion to the full lifecycle of geospatial data management at scale.

Requirements for our geospatial data solution

BASF Digital Farming’s xarvio FIELD MANAGER platform operates at exceptional scale in the geospatial data ecosystem, processing hundreds of millions of satellite images that translate into STAC items, which further decompose into billions of individual geospatial artifacts. Unlike traditional satellite data providers such as European Space Agency (ESA) who work with predictable, structured data flows, we operate in an inherently dynamic agricultural environment where we ingest near-daily satellite imagery per field from a diverse array of sensors and providers globally. Our mission to support farmers worldwide with advanced digital agronomic decision advice demands a reliable, cloud-based infrastructure capable of handling this massive data velocity and volume and applying advanced quality assurance processes including cloud detection and anomaly detection algorithms. The platform’s true value emerges through our machine learning (ML) pipelines that transform raw satellite data into actionable insights. For example, estimating accurate absolute biomass such as Leaf Area Index (LAI) helps farmers make precise, data-driven agronomic decisions that optimize crop yield and resource utilization across fields worldwide.

STAC and eoAPI ecosystem

To efficiently manage our growing archive of geospatial data, we adopted the SpatioTemporal Asset Catalog (STAC) specification, an open standard that provides a common language to describe and catalog raster and vector datasets. With STAC, we can standardize metadata across diverse sources like satellite imagery, UAV datasets, and prescription maps, making it straightforward to search, filter, and retrieve assets across our platform. We built our platform using the eoAPI ecosystem, an integrated suite of open source tools designed to handle the full lifecycle of geospatial data on the cloud. At its core is pgSTAC, which provides a performant PostGIS-backed STAC API implementation. With pgSTAC, we can index millions of STAC items efficiently, with support for spatial, temporal, and attribute-based filtering at scale. On top of that, we use Tiles in PostGIS (TiPG) to serve tiled vector data directly from our PostGIS database. This enables real-time visualization of field boundaries, management zones, and application histories as lightweight Mapbox Vector Tiles (MVT), without requiring an external tile server. For raster assets, including satellite and drone imagery, we rely on TiTiler, a modern dynamic tile server built for Cloud Optimized GeoTIFFs (COGs). With TiTiler, we can stream imagery on demand as WMTS or XYZ tiles, perform dynamic rendering (such as NDVI or false color composites), and integrate seamlessly into web maps and mobile apps.
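As an illustration of the dynamic rendering mentioned above, NDVI is computed per pixel from the near-infrared and red bands as (NIR - Red) / (NIR + Red). The sketch below shows the formula on plain Python lists; it does not use TiTiler, and the zero-signal handling is an assumption.

```python
from typing import List


def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for a single pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # no signal in either band; mapped to 0 here by assumption
    return (nir - red) / total


def ndvi_tile(nir_band: List[List[float]], red_band: List[List[float]]) -> List[List[float]]:
    """Apply NDVI pixel by pixel to two equally sized band tiles."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]
```

Values fall between -1 and 1, with dense green vegetation toward the upper end, which is why the index lends itself to on-the-fly color mapping in a tile server.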

Solution overview

The following architecture diagram shows how we implemented our geospatial data platform on AWS. In this section, we explain each component of the architecture and how they work together to process millions of satellite images and geospatial assets daily. The solution uses Amazon Elastic Kubernetes Service (Amazon EKS) as the core computing platform, with Amazon Simple Storage Service (Amazon S3) for storage and Amazon Relational Database Service (Amazon RDS) for metadata management. We break down the architecture into four main layers: core services, storage, database, and ingestion.

A detailed AWS Cloud architecture visualization showcasing a complete geospatial data processing system across four distinct layers. The database layer features an EKS Cluster managing STAC, raster, and vector services, all connected to Amazon RDS through a proxy instance. The client layer supports both desktop and mobile access via Amazon API Gateway. The ingestion layer processes geospatial data streams through a STAC ingestor, feeding into a robust storage layer utilizing Cloud Optimized GeoTIFF and FlatGeobuf technologies. The architecture emphasizes scalability and efficient spatial data handling through PostgreSQL with pgstac extension, enabling seamless integration of various geospatial services and data formats.

Core services layer

The solution uses an EKS cluster hosting three key services:

  • stac-service – Implements the STAC API specification to catalog and serve metadata for both raster and vector datasets
  • raster-service – Powered by TiTiler, this service dynamically renders and tiles cloud-optimized raster data (for example, COGs) for seamless integration into web and mobile maps
  • vector-service – Built with TiPG, this component serves vector data (for example, boundaries or application zones) as tiled MVT layers directly from the database or from Amazon S3

These services are containerized and orchestrated within Kubernetes, allowing for high availability, modular separation, and simplified continuous integration and delivery (CI/CD) workflows.
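To give a feel for what the stac-service catalogs, the following sketch shows a pared-down STAC item and an in-memory search by bounding box and time range. The item ID, coordinates, and asset URL are made up for illustration, and the filtering here is a simplification that does not reflect how pgSTAC executes queries (for example, it ignores boxes that cross the antimeridian).

```python
from datetime import datetime

# A pared-down STAC item; all values are hypothetical.
ITEM = {
    "type": "Feature",
    "stac_version": "1.0.0",
    "id": "field-123-20240501",
    "bbox": [8.40, 49.40, 8.45, 49.43],  # west, south, east, north
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[8.40, 49.40], [8.45, 49.40], [8.45, 49.43],
                         [8.40, 49.43], [8.40, 49.40]]],
    },
    "properties": {"datetime": "2024-05-01T10:30:00Z"},
    "assets": {"visual": {"href": "s3://example-bucket/field-123/visual.tif"}},
    "links": [],
}


def parse_utc(timestamp: str) -> datetime:
    """Parse an RFC 3339 timestamp; 'Z' is rewritten for older Python versions."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def bbox_intersects(a, b) -> bool:
    """True if two [west, south, east, north] boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(items, bbox, start, end):
    """Items whose bbox intersects `bbox` and whose datetime lies in [start, end]."""
    lo, hi = parse_utc(start), parse_utc(end)
    return [
        item for item in items
        if bbox_intersects(item["bbox"], bbox)
        and lo <= parse_utc(item["properties"]["datetime"]) <= hi
    ]
```

Because every data source is described with the same few fields, the same search works unchanged for satellite scenes, drone imagery, and application maps.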

KEDA-based automatic scaling

We use Kubernetes Event-Driven Autoscaling (KEDA) to scale our platform services dynamically based on real-time workloads. With KEDA, we can scale individual pods based on precise event-driven metrics such as the STAC ingestion queue depth or visualization request load. This supports responsive performance during peak activity while maintaining lean resource usage during idle periods, aligning perfectly with our need for elasticity in a data-intensive, variable-load environment.
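The core of queue-driven scaling can be sketched as a simple calculation: divide the backlog by the amount of work one pod should handle, then clamp the result. This is a simplified illustration, not our scaler configuration; the parameter names and default bounds are assumptions, and real autoscalers additionally apply polling intervals, cooldowns, and tolerances.

```python
import math


def desired_replicas(queue_depth: int, target_per_replica: int,
                     min_replicas: int = 0, max_replicas: int = 10) -> int:
    """Replica count needed so each pod handles about `target_per_replica` messages."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(queue_depth / target_per_replica)
    return max(min_replicas, min(max_replicas, wanted))
```

With a target of 100 messages per pod, an empty queue yields zero replicas, a backlog of 250 yields three, and a backlog of 5,000 is capped at the configured maximum of ten.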

Geospatial asset storage layer

The platform stores all raw and processed geospatial assets in S3 buckets, optimized for performance and durability. This layer holds COGs for raster imagery and FlatGeobuf or similar formats for vector data. These formats are chosen for their support of streaming access, indexing, and cloud-based performance.

Database layer

The metadata backbone of the system is a PostgreSQL database hosted on Amazon RDS, extended with the pgSTAC plugin. This setup enables efficient indexing and querying of millions of STAC items and collections. An RDS proxy sits in front of the database, providing connection pooling and resiliency, especially under bursty or concurrent access patterns common in geospatial applications.

Ingestion layer

An independent ingestion component handles batch or streaming geospatial data inputs. This component processes satellite imagery, drone data, or prescription maps and pushes relevant metadata into the STAC API and storage assets into Amazon S3. The ingestion engine is decoupled from serving infrastructure, enabling asynchronous and large-scale data loading.

Amazon API Gateway and clients

Public access to the platform is handled through Amazon API Gateway, allowing clients—whether browser-based or mobile—to interact securely with the services. The API gateway provides a unified entrypoint and is used for applying rate limiting, authorization, and routing policies.

Solution benefits

The solution offers the following benefits:

  • Rapid onboarding with STAC standardization – By aligning with the STAC specification, we’ve significantly reduced the time to onboard new data domains like sprayer application maps. Compared to previous approaches in our legacy system, metadata modeling and integration are now both standardized and automated, so we can expose new geospatial data products to clients in days instead of weeks or months.
  • Optimized storage with COGs and Amazon S3 – Storing raster and vector assets in Amazon S3 using cloud-optimized formats (such as COGs for imagery or FlatGeobuf for vectors) reduces storage costs while enabling low-latency, streaming access. This avoids the need for preprocessing or extract, transform, and load (ETL)-heavy pipelines and simplifies client delivery.
  • Large-scale ingestion with a batch STAC ingestor – Our custom STAC ingestor supports both real-time and batch-mode operations. This has made it possible to onboard satellite constellations, drone imagery, and historical datasets in bulk without disrupting running services. The ingestion service uses optimized database ingestion functions, capable of ingesting thousands of items per second, providing high-throughput and reliable data integration at scale.
  • PostgreSQL, pgSTAC, and Amazon RDS Proxy for a scalable metadata backbone – With pgSTAC and Amazon RDS Proxy, we benefit from advanced spatial-temporal querying while making sure database connection management is handled gracefully, even under high concurrency. This combination offers reliability without compromising performance.
  • Scalable deployment with Amazon EKS – Hosting the solution on Amazon EKS provides full control over deployments, resource tuning, and service orchestration. Combined with automatic scaling, we dynamically adjust compute capacity based on demand, facilitating resilience and cost-efficiency.

Learnings

As part of building this solution, we learned the following:

  • RDS Proxy is essential for automatically scaled environments – Given our use of automatic scaling pods in Amazon EKS, we found that RDS Proxy is critical. It handles connection pooling efficiently and protects the underlying PostgreSQL database from connection exhaustion during sudden scale-up events. Without it, we encountered spiky load failures and blocked connections during high-ingest periods.
  • Batch STAC ingestor is a core component – Our custom STAC ingestor proved to be an indispensable piece of the system. It interfaces directly with pgSTAC to perform large-scale, automated ingestions of geospatial metadata from streams and archives. Without this tool, onboarding data providers or processing legacy imagery at scale would have been labor-intensive and error-prone.
  • COGs are non-negotiable – For fast, scalable visualization of large raster datasets, COGs are essential, particularly if raster datasets exceed several gigabytes. They enable efficient HTTP range requests, alleviate the need for preprocessing, and work seamlessly with TiTiler for real-time tile rendering. Non-COG formats led to noticeably slower performance and weren’t suitable for cloud-based visualization.
  • Serverless-compliant, optimized for Amazon EKS (for now) – Although the architecture is designed to be serverless-compatible, we opted for an Amazon EKS first approach due to the nature of our other application landscape. Components like TiTiler and TiPG benefit from persistent, memory-tuned environments that are harder to achieve in a serverless runtime. However, the solution remains modular and stateless by design, and certain subsystems (such as ingestion triggers, notifications, or monitoring) are already candidates for future serverless migration to further improve elasticity and reduce operational overhead.
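The HTTP range requests mentioned in the COG learning above can be illustrated with a short sketch. A tiled GeoTIFF records a byte offset and byte count for each tile, so a client can fetch a single tile without downloading the whole file. The function names are illustrative, and the example assumes a single image with tiles stored in row-major order.

```python
import math
from typing import Dict, List


def tile_index(col: int, row: int, image_width: int, tile_width: int) -> int:
    """Position of tile (col, row) in the row-major tile list."""
    tiles_across = math.ceil(image_width / tile_width)
    if not 0 <= col < tiles_across:
        raise ValueError("column outside image")
    return row * tiles_across + col


def range_header(tile_offsets: List[int], tile_byte_counts: List[int],
                 index: int) -> Dict[str, str]:
    """HTTP header that requests exactly the bytes of one tile."""
    start = tile_offsets[index]
    end = start + tile_byte_counts[index] - 1  # HTTP byte ranges are inclusive
    return {"Range": f"bytes={start}-{end}"}
```

For a multi-gigabyte raster, this means a map view that needs four tiles transfers only those four byte ranges, which is the main reason non-COG formats performed noticeably worse for us.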

Conclusion

BASF Digital Farming GmbH has successfully implemented a STAC-based geospatial data platform on Amazon EKS, enabling efficient management and visualization of satellite imagery, drone data, and application maps. This architecture helps us onboard new data sources within weeks rather than months. The new platform also processes twice as much data in a single day while cutting costs by 50%, thanks to reduced data handling through the STAC schema and the efficiencies of automatic scaling. By adopting the STAC standard, the architecture improves data discoverability, reduces search latency, and supports more efficient analytic workflows.

Organizations looking to build similar geospatial data solutions can use AWS services like Amazon EKS, Amazon S3, and Amazon RDS along with open source tools like STAC and eoAPI to create scalable, cost-effective solutions. Learn more about building containerized applications on AWS at Containers on AWS.

Efficient satellite imagery supply with AWS Serverless at BASF Digital Farming GmbH

Post Syndicated from Kevin S. Ridolfi original https://aws.amazon.com/blogs/architecture/efficient-satellite-imagery-supply-with-aws-serverless-at-basf-digital-farming-gmbh/

This post was co-written with Dr. Jan Melchior at BASF Digital Farming GmbH and xarvio Digital Farming Solutions.

BASF Digital Farming’s mission is to support farmers worldwide with cutting-edge digital agronomic decision advice by using its main crop optimization platform, xarvio FIELD MANAGER. This necessitates providing the most recent satellite imagery available as quickly as possible. This blog post describes the serverless architecture developed by BASF Digital Farming for efficiently downloading and supplying satellite imagery from various providers to support its xarvio platform.

Figure 1. Screenshot showing the xarvio Field Manager platform

Architecture

Figure 2 shows the serverless architecture implemented with AWS services for downloading and processing satellite imagery. The subscription management components handle subscription creation, updates, and deletions, while the actual data downloading and processing occurs in AWS Step Functions.

Figure 2. Serverless implementation of the new imagery service

  1. Subscriptions are created using Amazon API Gateway for external API access, which provides request throttling and can be used to manage API request authorizations.
  2. An AWS Lambda API function manages subscriptions. It implements common create, read, update, and delete operations with request validations and provides an endpoint for replaying failed requests. Subscriptions contain a geometry, a data provider, start and end dates, and other parameters, which are stored in the subscription database (Step 7) before a message is sent out for processing.
    Notice that the entire architecture is serverless and thus allows for theoretically unbounded scaling. In case of a bug, this can lead to severe cost impacts, so we implemented a safety buffer, which enables us to prioritize and limit the number of Step Functions executions of the processing pipeline.
  3. All requests (such as the initial request for imagery when a subscription is created) are sent to the Amazon Simple Queue Service (Amazon SQS) processing queue first, which functions as a processing buffer and allows for request prioritization.
  4. Subsequently, Amazon EventBridge Pipes connects the processing buffer with AWS Step Functions. It handles pipe-internal errors automatically; for example, when the Step Functions concurrency limit is reached, the invocation will be retried automatically. This does not handle exceptions raised within Step Functions, such as runtime errors.
  5. AWS Step Functions then performs the actual downloading, processing, and ingestion to the STAC catalog of satellite data from different providers. In case of failure, the request message with error description is sent to the failure queue.
  6. Step Functions uploads the data to Amazon Simple Storage Service (Amazon S3), which stores satellite imagery data.
  7. Following this, Step Functions updates the subscriptions in the Amazon DynamoDB-based subscription database, which stores relevant metadata, such as start and end date, boundary, provider, collection, and last update.
  8. A notification is sent out to inform the user that new data is available through Amazon Simple Notification Service (Amazon SNS), which informs users and services about any updates on a subscription, such as new data being available or subscriptions having been created, deleted, updated, or having failed.
  9. Next, the data is published to our internal STAC catalog, which registers the satellite imagery and makes it directly accessible for subsequent processing.
  10. In case of failed Step Functions execution in Step 5, the Amazon SQS-based failure queue buffers failed executions. Failure messages contain the error message and request body. Depending on the cause of the error, they can be reprocessed through the replay endpoint on the API Lambda function. The endpoint also allows users to filter messages based on their failure type and to delete messages that cannot be replayed.
  11. An update checker, built on AWS Lambda, regularly checks whether a subscription can be updated. It is triggered in conjunction with an event scheduler every 5 minutes, checks the database for subscriptions that can be updated, and sends update request messages to the processing buffer. Besides actively checking resources, such as API endpoints and STAC catalogs, it also sends out an update message if a notification was received, for example, through an external notification service.
  12. Finally, a delete checker, also built on AWS Lambda, identifies subscriptions that can be deleted. It is triggered in conjunction with an event scheduler every 12 hours. It regularly checks the database for subscriptions that can be deleted and removes them from the database, the S3 bucket, and the STAC catalog. As a safety mechanism, a subscription will first be marked for deletion for 6 months before it gets deleted.
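The safety buffer described in steps 2 to 4 combines two ideas: requests are ordered by priority, and only a bounded number of pipeline executions may run at once. The sketch below models both in memory. It is not how Amazon SQS or EventBridge Pipes are implemented; the class name, the numeric priority scheme (lower runs first), and the `max_concurrent` parameter are assumptions.

```python
import heapq
import itertools
from typing import Optional


class ProcessingBuffer:
    """In-memory stand-in for a prioritized queue with a concurrency cap."""

    def __init__(self, max_concurrent: int) -> None:
        self._heap = []
        self._order = itertools.count()  # tie-breaker keeps FIFO order per priority
        self.max_concurrent = max_concurrent
        self.running = 0

    def submit(self, request: dict, priority: int = 1) -> None:
        """Queue a request; lower priority numbers are started first."""
        heapq.heappush(self._heap, (priority, next(self._order), request))

    def start_next(self) -> Optional[dict]:
        """Start the most urgent request, or return None if at capacity or empty."""
        if self.running >= self.max_concurrent or not self._heap:
            return None
        _, _, request = heapq.heappop(self._heap)
        self.running += 1
        return request

    def finish(self) -> None:
        """Mark one running execution as done, freeing a slot."""
        self.running = max(0, self.running - 1)
```

The cap is what bounds cost if a bug floods the system with requests: the backlog grows in the queue, but the number of concurrent executions, and therefore spend, does not.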

Imagery step function

The actual downloading and processing of data from different providers is handled by the imagery function, illustrated for two different providers (Public and Planet) in Figure 3.

Figure 3. Diagram showing the detailed state machine for the Imagery Step Function

  1. When a request arrives, the provider choice state determines the provider from the request body, depending on which the Step Functions flow routes to different Lambda states.
  2. In case a public provider is selected (for example, Earth Search), the Public_Provider Lambda function downloads the data from STAC-based open data providers and directly uploads it to the S3 data bucket, as shown in Figure 2.
  3. In case Planet data is selected, the data retrieval involves an asynchronous call to an external API: First, the Planet_Requester sends an order to the Planet API, together with a task token for pausing Step Functions and the URL of the Planet_Webhook Lambda function.
  4. The Planet_Webhook function is invoked by Planet when the requested order is available for downloading. Given the transmitted task token, Step Functions is resumed with the next state.
  5. Subsequently, the Planet_Provider Lambda function downloads and processes the Planet data.
  6. For both public providers and Planet, the subsequent Public_Provider Lambda function updates the subscription database entries, as shown in Figure 2 (for example, with the latest available timestamp), and adds the downloaded and processed data to the internal STAC catalog, before it ends in the Success state.
  7. If an error occurs in any of the Lambda functions (2, 3, 5, 6), an error message is prepared in the Error_Parsing state. If an unknown provider is handed in, an error message, including the request body, is prepared in the Error_Provider_Unknown state. In both cases, the error message is pushed to the Failure_Queue (refer to #10 of Figure 2), before it ends in the Failure state.
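The routing and error handling in the steps above can be approximated in plain Python. This is a simulation for illustration, not a Step Functions definition: the handler functions and the `ROUTES` table are hypothetical, and the asynchronous Planet callback with a task token is not modeled.

```python
from typing import Callable, Dict, List

Step = Callable[[dict], None]


def run_imagery_flow(request: dict, routes: Dict[str, List[Step]],
                     failure_queue: List[dict]) -> str:
    """Route a request by provider, run its steps, and queue any failure."""
    provider = request.get("provider")
    steps = routes.get(provider)
    if steps is None:
        # Mirrors Error_Provider_Unknown: keep the request body for a later replay.
        failure_queue.append({"error": f"unknown provider: {provider!r}",
                              "request": request})
        return "Failure"
    try:
        for step in steps:
            step(request)
    except Exception as exc:
        # Mirrors Error_Parsing: record what went wrong alongside the request.
        failure_queue.append({"error": str(exc), "request": request})
        return "Failure"
    return "Success"


# Hypothetical handlers standing in for the Lambda functions.
def download_public(request: dict) -> None:
    request["downloaded"] = True


def update_catalog(request: dict) -> None:
    request["cataloged"] = True


ROUTES = {"public": [download_public, update_catalog]}
```

Storing the original request next to the error is what makes the replay endpoint possible: a failed message carries everything needed to submit it again.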

Conclusion

BASF Digital Farming GmbH developed a serverless architecture on AWS for efficiently downloading and supplying satellite imagery for use by its xarvio platform. This architecture led to a 5x faster delivery rate, an 80% cost reduction through on-demand data downloading, and a 3x accelerated development cycle. Future work will include optimizing the architecture, exploring additional AWS services, and onboarding more satellite imagery providers. Similar serverless architectures using AWS services like AWS Step Functions, AWS Lambda, and Amazon API Gateway can enhance flexibility, scalability, and cost efficiency in imagery provisioning. Learn more about AWS serverless offerings at aws.amazon.com/serverless.