S3 | tecRacer Amazon AWS Blog

27 Aug '24

Enabling Apache Airflow to copy large S3 objects

If you’re trying to use Apache Airflow to copy large objects in S3, you might have run into S3 rejecting the request with an InvalidRequest error. We will fix that in this post by writing a custom operator that handles the underlying problem.
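As a hedged illustration: a single S3 CopyObject request only handles sources up to 5 GB, and larger objects have to be copied part by part in a multipart upload. Assuming that limit is the underlying problem here (the teaser does not say so), the sketch below computes the byte ranges such a copy would need. The function name `copy_part_ranges` and the 512 MiB default part size are illustrative choices, not taken from the post.

```python
from __future__ import annotations


def copy_part_ranges(object_size: int, part_size: int = 512 * 1024 * 1024) -> list[str]:
    """Return inclusive 'bytes=start-end' ranges that together cover the object."""
    if object_size <= 0 or part_size <= 0:
        raise ValueError("object_size and part_size must be positive")
    ranges = []
    start = 0
    while start < object_size:
        # The last part may be shorter than part_size.
        end = min(start + part_size, object_size) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


# Tiny numbers to keep the example readable: 10 bytes in parts of 4.
print(copy_part_ranges(10, 4))
```

Each range would be passed as the `CopySourceRange` of one UploadPartCopy call. The custom operator in the post may solve the problem differently.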


07 May '24

How to migrate data from Amazon EFS to Amazon S3 with AWS DataSync

Written by Franck Awounang Nekdem

AWS DataSync is a service that simplifies and accelerates data migrations not only to but also from and between AWS storage services. In this blog post we will see how to leverage it to migrate data from an EFS file system to an Amazon S3 bucket.


02 Apr '24

Build a Serverless S3 Explorer with Dash

Written by Maurice Borgmeier

Many projects get to the point where your sophisticated infrastructure delivers reports to S3 and you now need a way for your end users to get them. Giving everyone access to the AWS account usually doesn’t work. In this post we’ll look at an alternative: we’re going to build a Serverless S3 Explorer with Dash, Lambda, and the API Gateway.


30 May '23

Build Terraform CI/CD Pipelines using AWS CodePipeline

Written by Hendrik Hagen

When deciding which Infrastructure as Code tool to use for deploying resources in AWS, Terraform is often a favored choice and should therefore be a staple in every DevOps Engineer’s toolbox. While Terraform can increase your team’s performance quite significantly even when used locally, embedding your Terraform workflow in a CI/CD pipeline can boost your organization’s efficiency and deployment reliability even more. By adding automated validation tests, linting, and security and compliance checks, you additionally ensure that your infrastructure adheres to your company’s standards and guidelines. In this blog post, I would like to show you how you can leverage the AWS Code Services CodeCommit, CodeBuild, and CodePipeline in combination with Terraform to build a fully managed CI/CD pipeline for Terraform.


11 Apr '23

Push-Down-Predicates in Parquet and how to use them to reduce IOPS while reading from S3

Written by Maurice Borgmeier

Working with datasets in pandas will almost inevitably bring you to the point where your dataset doesn’t fit into memory. Parquet is especially notorious for that, since it’s so well compressed and tends to explode in size when read into a dataframe. Today we’ll explore ways to limit and filter the data you read using push-down-predicates. Additionally, we’ll see how you can do that efficiently with data stored in S3 and why using pure pyarrow can be several orders of magnitude more I/O-efficient than the plain pandas version.
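To make the idea concrete, here is a toy model of a push-down predicate, not pyarrow's implementation. Parquet files keep min/max statistics per row group, so a reader can skip any row group whose statistics rule out a match without reading its data. The `RowGroup` class and `read_greater_than` function are hypothetical names invented for this sketch.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class RowGroup:
    min_value: int   # statistics stored in the file metadata
    max_value: int
    rows: list[int]  # the column data; "reading" this stands in for I/O


def read_greater_than(row_groups: list[RowGroup], threshold: int) -> tuple[list[int], int]:
    """Return the values > threshold and how many row groups were actually read."""
    matches = []
    groups_read = 0
    for group in row_groups:
        if group.max_value <= threshold:
            continue  # statistics prove nothing can match, so skip the data
        groups_read += 1
        matches.extend(value for value in group.rows if value > threshold)
    return matches, groups_read


groups = [
    RowGroup(1, 3, [1, 2, 3]),
    RowGroup(4, 9, [4, 9, 6]),
    RowGroup(10, 12, [10, 12]),
]
print(read_greater_than(groups, 5))
```

With a threshold of 5, the first row group is never touched. On S3, every skipped row group is a range request that does not have to be made.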


12 Feb '23

Building an AWS Lambda Telemetry API extension for direct logging to Grafana Loki

Written by Gernot Glawe

In hybrid architectures, serverless functions work together with container solutions. Lambda logs have to be translated into the target format when CloudWatch Logs is not your log destination. The old way of doing this is through subscription filters that use additional Lambda functions for log transformation. With the Lambda Telemetry API, there is a more elegant, performant, and cost-effective way. I use Grafana Loki as a working example and show you how to build a Lambda-Loki Telemetry API extension.
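The translation step might look roughly like the following sketch. It assumes Telemetry API events carry an ISO 8601 `time` plus a `record` field, and that Loki's push endpoint accepts streams of `[nanosecond timestamp, log line]` pairs. The function name `to_loki_payload` and the label set are made up for illustration, and the extension in the post may be structured quite differently.

```python
from __future__ import annotations

import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_loki_payload(events: list[dict], labels: dict[str, str]) -> dict:
    """Convert telemetry events into one Loki stream (assumed push format)."""
    values = []
    for event in events:
        # fromisoformat() on older Python versions rejects a trailing "Z".
        ts = datetime.fromisoformat(event["time"].replace("Z", "+00:00"))
        # Integer arithmetic avoids float rounding on nanosecond timestamps.
        micros = (ts - EPOCH) // timedelta(microseconds=1)
        record = event["record"]
        line = record if isinstance(record, str) else json.dumps(record)
        values.append([str(micros * 1000), line])
    return {"streams": [{"stream": labels, "values": values}]}


events = [{"time": "1970-01-01T00:00:01.000Z", "type": "function", "record": "hello"}]
print(json.dumps(to_loki_payload(events, {"source": "lambda"})))
```

An extension would then POST this payload to the Loki push endpoint. Batching, retries, and authentication are left out here.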


13 Jan '23

What are the folders in the S3 console?

Written by Maurice Borgmeier

When you start out learning about S3, the experts and documentation will tell you that you should think of S3 as a flat key-value store that doesn’t have any hierarchical structure. Then you go ahead and create your first S3 bucket in the console, and what the interface shows you is a nice big “Create Folder” button. You may be justifiably confused: didn’t I just learn that there are no folders, directories, or hierarchy in S3?
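One way to reconcile the two views is to see how a folder-like listing can be derived from flat keys. S3's list operations accept a prefix and a delimiter and group everything behind the next delimiter into common prefixes. The sketch below imitates that grouping in plain Python. It is a simplified model with a hypothetical function name `list_level`, not the S3 API itself.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Return (objects, common_prefixes) visible at one 'folder' level."""
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Everything up to the next delimiter is shown as one "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


keys = [
    "reports/2023/jan.csv",
    "reports/2023/feb.csv",
    "reports/readme.txt",
    "logo.png",
]
print(list_level(keys))              # top level of the bucket
print(list_level(keys, "reports/"))  # inside the "reports/" folder
```

The keys stay flat the whole time. The "folders" exist only in how the listing is grouped.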


18 Dec '22

Serverless Spy Vs. Spy Chapter 3: X-Ray vs Jaeger - Send Lambda traces with open telemetry

Written by Gernot Glawe

In modern architectures, Lambda functions co-exist with containers. Cloud Native Observability is achieved with OpenTelemetry. I show you how to send OpenTelemetry traces from Lambda to a Jaeger tracing server. Let’s see how this compares to the X-Ray tracing service.


Articles tagged with "s3"