
Navigating and Optimizing Data Transfer Costs in AWS and Azure

  • Vivien Pfeiffer & Thomas Krenbauer
  • 12 minutes ago
  • 4 min read
Introduction

Data transfer costs in cloud environments are notoriously intricate, presenting a significant challenge for optimization. Although many native tools and third-party advisors offer various cost optimization strategies, there are surprisingly few options for optimizing data transfer expenses specifically (see graphic below). This blog post dissects the complexities of data transfer in AWS and Azure, providing detailed insights and comparisons between the two platforms. Examples are drawn from both cloud service providers; where the two behave the same, only one is shown, and both are shown only where they differ significantly.


Figure 1: Tool Comparison

Let's Start with Some Basics 

Understanding how data transfer costs are generated is crucial for effective optimization. In cloud ecosystems, costs are primarily influenced by the path data travels through the provider's network. While it's true that the further data travels, the higher the costs, the real determining factor is the number of "borders" within the cloud service provider that data crosses.


Figure 2: Borders

These “borders” include Availability Zones, Regions, and VNETs (in Azure) or VPCs (in AWS). Crossing each of these borders can incur additional costs: 

  • Within an Availability Zone: Data transfers are generally free if private IPs are employed. This makes intra-AZ transfers cost-effective. 

  • Data Source and Destination: The charges typically apply to data leaving a service rather than entering it. This means the costs are often associated with data egress. 
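The idea of counting borders can be sketched in a few lines of Python. This is only an illustration of the concept: the `Location` type, its field names, and the example region and zone labels are assumptions made for this sketch, and no prices are attached because the actual charge per border depends on the provider and the service.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """Hypothetical description of where a workload runs."""
    region: str
    zone: str      # Availability Zone within the region
    network: str   # VPC (AWS) or VNET (Azure)


def crossed_borders(src: Location, dst: Location) -> list[str]:
    """List which of the borders named above a transfer from src to dst crosses."""
    borders = []
    if src.region != dst.region:
        borders.append("region")
    # Zones only compare meaningfully together with their region.
    if (src.region, src.zone) != (dst.region, dst.zone):
        borders.append("availability zone")
    if src.network != dst.network:
        borders.append("network")
    return borders


app = Location(region="region-a", zone="zone-1", network="net-1")
db_same_zone = Location(region="region-a", zone="zone-1", network="net-1")
db_other_zone = Location(region="region-a", zone="zone-2", network="net-1")

print(crossed_borders(app, db_same_zone))
print(crossed_borders(app, db_other_zone))
```

A transfer that returns an empty list stays within one Availability Zone and one network, which is the case described above as generally free when private IPs are used.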


Figure 3: Cost Borders

In the following sections, we will explore three strategies designed to effectively reduce data transfer costs. 



Optimization Opportunity 1: Compress Data 

One of the most straightforward ways to reduce data transfer costs is by compressing data before it is transferred. This strategy can lead to cost savings of around 50%. However, this can vary based on the type of data that is transferred. By reducing the volume of data, not only does the transfer become faster, but storage costs are also reduced. Cloud providers often offer automatic compression for certain use cases, particularly when managed services are utilized. 
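How much a payload shrinks depends heavily on its content, so it can be worth measuring before estimating savings. The following is a minimal sketch using Python's standard `gzip` module; the helper name `compression_ratio` and the two sample payloads are assumptions made for illustration, not data from the scenario in this post.

```python
import gzip
import os


def compression_ratio(payload: bytes) -> float:
    """Return compressed size divided by original size (lower means better compression)."""
    if not payload:
        raise ValueError("payload must not be empty")
    return len(gzip.compress(payload)) / len(payload)


# Repetitive, text-like data (for example logs) usually compresses well.
text_like = b"timestamp=2024-01-01 level=INFO msg=request served\n" * 2000

# Random bytes stand in for data that is already compressed or encrypted.
random_like = os.urandom(100_000)

print(f"text-like sample:   {compression_ratio(text_like):.1%} of original size")
print(f"random-like sample: {compression_ratio(random_like):.1%} of original size")
```

Running this against a representative sample of the real data gives a more reliable estimate than a flat 50% assumption.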

In this example, 50 TB of data is transferred to the internet from East US 2 in Azure or from US East (N. Virginia) in AWS. Assuming the data can be compressed by 50%, the resulting data transfer cost is cut roughly in half. The calculation below considers the tiered pricing structure but neglects the free tier and the compute power needed to compress and decompress the data.
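The arithmetic behind a tiered egress calculation can be sketched as follows. The tier sizes and per-GB prices in `EXAMPLE_TIERS` are placeholder values chosen for illustration, not current AWS or Azure list prices, and 1 TB is assumed to equal 1,000 GB.

```python
# Hypothetical tiers: (tier size in GB, price per GB in USD).
# A size of None marks the last, unbounded tier.
EXAMPLE_TIERS = [(10_000, 0.09), (40_000, 0.085), (None, 0.07)]


def tiered_cost(volume_gb: float, tiers=EXAMPLE_TIERS) -> float:
    """Bill volume_gb tier by tier, starting with the first tier."""
    remaining = volume_gb
    total = 0.0
    for size, price in tiers:
        if remaining <= 0:
            break
        billed = remaining if size is None else min(remaining, size)
        total += billed * price
        remaining -= billed
    return total


uncompressed = tiered_cost(50_000)        # 50 TB
compressed = tiered_cost(50_000 * 0.5)    # 50% compression -> 25 TB

print(f"without compression: {uncompressed:,.2f} USD")
print(f"with compression:    {compressed:,.2f} USD")
print(f"saving:              {1 - compressed / uncompressed:.1%}")
```

With tiers that get cheaper at higher volumes, halving the volume saves slightly less than half the cost, because the remaining data sits in the more expensive early tiers.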


Figure 3: Calculation without and with 50% Compression


Optimization Opportunity 2: Use a Content Delivery Network (CDN) 

CDNs operate like distribution centers, caching content at various strategic locations worldwide. This method offers a multitude of advantages: 

  • Global Availability: Content is readily accessible across the globe through numerous Points of Presence (PoPs). 

  • Improved Speed: Data is delivered more rapidly to end-users, enhancing the user experience. 

  • Enhanced Security: Leading cloud providers offer robust security measures, including HTTPS, DDoS protection, and advanced access controls. 

The visual below illustrates how a CDN works, using AWS CloudFront as the example; the functionality is the same in Azure.

  1. The user visits a website or application and requests an object. 

  2. DNS directs the request to the nearest CloudFront edge location, normally based on latency. 

  3. CloudFront checks its cache. If the object is cached, it is immediately returned to the user. 

    1. If the object is not cached, CloudFront forwards the request to the origin server, such as an Amazon S3 bucket or HTTP server. 

    2. The origin server responds with the requested object, which is sent to the edge location. 

    3. As soon as CloudFront starts receiving the object, it begins delivering it to the user and stores a copy in the cache for future requests. 
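The lookup flow in the steps above can be modeled with a small in-memory cache. This is a simplified sketch, not CloudFront's implementation: the `EdgeCache` class and the `fetch_from_origin` callback are hypothetical names, and real CDNs additionally handle expiry, eviction, and streaming the object to the user while it is still being fetched.

```python
from typing import Callable, Dict


class EdgeCache:
    """Toy model of a single edge location sitting in front of an origin server."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        # Step 3: check the cache and return the object on a hit.
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Sub-steps 1-3: forward to the origin, keep a copy, then return it.
        self.misses += 1
        obj = self._fetch(key)
        self._store[key] = obj
        return obj


def demo_origin(key: str) -> bytes:
    """Stand-in for an origin such as a storage bucket or HTTP server."""
    return f"content of {key}".encode()


edge = EdgeCache(demo_origin)
for path in ["/logo.png", "/logo.png", "/index.html", "/logo.png"]:
    edge.get(path)
print(f"hits={edge.hits} misses={edge.misses}")
```

Only the misses reach the origin, which is why the hit rate matters for the cost calculation.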


Figure 4: CDN

By caching data at PoPs, the bandwidth demand on the origin server is reduced. Consequently, delivering data from these PoPs is more cost-effective than transferring it directly from the origin server.

In this calculation example, 25 TB of data is requested by different end-users. Since most of the data is cached in the closest PoP, it can be delivered faster and from a closer location, which lowers the data transfer cost. The calculation assumes an 80% hit rate, meaning that 80% of the requested data is already cached in the PoP and does not have to be fetched from the origin.
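The effect of the hit rate on volumes can be sketched as follows. The function names and both per-GB prices are hypothetical inputs supplied by the caller, not provider list prices; whether the origin-to-PoP leg is billed at all depends on the provider and setup, so that price may be zero. 1 TB is assumed to equal 1,000 GB.

```python
def origin_volume_gb(requested_gb: float, hit_rate: float) -> float:
    """Volume that still has to be fetched from the origin (cache misses)."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return requested_gb * (1.0 - hit_rate)


def cdn_delivery_cost(
    requested_gb: float,
    hit_rate: float,
    cdn_price_per_gb: float,
    origin_price_per_gb: float,
) -> float:
    """Cost of serving everything via the CDN plus fetching misses from the origin."""
    to_users = requested_gb * cdn_price_per_gb
    from_origin = origin_volume_gb(requested_gb, hit_rate) * origin_price_per_gb
    return to_users + from_origin


requested = 25_000  # 25 TB
print(f"fetched from origin: {origin_volume_gb(requested, 0.8):,.0f} GB")
print(f"served from cache:   {requested - origin_volume_gb(requested, 0.8):,.0f} GB")
```

The cost function is only meaningful once real prices for the chosen provider and region are plugged in.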


Figure 5: Calculation without and with CDN


Optimization Opportunity 3: Use Gateway/Service Endpoints 

Implementing Gateway Endpoints (in AWS) or Service Endpoints (in Azure) establishes a secure connection using private IPs. This approach eliminates the need for Network Address Translation (NAT) and reduces the complexity of firewall configurations, making it a streamlined solution. Most importantly, data transferred through these endpoints is free of charge.  

  • Limitations

    • This method can only be used within the same region. 

    • In Azure, it is applicable only to data transfers to most storage or database solutions. 

    • In AWS, this applies to data transfers towards S3 or DynamoDB services. 


In this calculation example, 25 TB needs to be transferred from a VM to a storage service within the same region. Because Gateway/Service Endpoints are used, the data remains within the same network and is not transferred via the internet. Therefore, no costs are incurred for this data transfer.
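The comparison can be sketched for the AWS case, using the limitations listed above as the eligibility rule. The function name and the `per_gb_without_endpoint` rate are assumptions for illustration: the rate is a placeholder for whatever per-GB charge applies on the alternative path, not a published price. Azure is left out because the list of eligible services is not enumerated here.

```python
# Services named above as reachable through AWS Gateway Endpoints.
AWS_GATEWAY_ENDPOINT_SERVICES = {"s3", "dynamodb"}


def transfer_cost(
    volume_gb: float,
    service: str,
    same_region: bool,
    per_gb_without_endpoint: float,
) -> float:
    """Return 0 when a gateway endpoint can carry the traffic, else volume times rate."""
    eligible = same_region and service.lower() in AWS_GATEWAY_ENDPOINT_SERVICES
    return 0.0 if eligible else volume_gb * per_gb_without_endpoint


volume = 25_000              # 25 TB, assuming 1 TB = 1,000 GB
placeholder_rate = 0.045     # hypothetical USD per GB on the non-endpoint path

print(transfer_cost(volume, "S3", same_region=True, per_gb_without_endpoint=placeholder_rate))
print(transfer_cost(volume, "S3", same_region=False, per_gb_without_endpoint=placeholder_rate))
```

The second call shows the same-region limitation: once the transfer crosses regions, the endpoint no longer applies and the per-GB charge returns.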


Figure 6: Calculation without and with Endpoints


Summary and Conclusion 

Effectively managing and reducing data transfer costs requires a strategic approach. Here are the key takeaways: 


  • Compress Data: Implement data compression to significantly reduce transfer volumes and associated costs. 

  • Localize Data: Keep data transfers within your network as much as possible to avoid unnecessary expenses. 

  • Minimize Transfer Distances: Always aim to transfer data over the shortest possible distances to optimize costs. 


By adopting these strategies, organizations can achieve substantial cost reductions in their cloud operations, enhancing both efficiency and profitability. 


 
 
 
