What are the folders in the S3 console?

When you start out learning about S3, the experts and documentation will tell you that you should think of S3 as a flat key-value store that doesn’t have any hierarchical structure. Then you go ahead and create your first S3 bucket in the console, and what the interface shows you is a nice big “Create Folder” button.

S3 Console Create Folder

You may be justifiably confused - didn’t I just learn that there are no folders, directories, or hierarchy in S3? The Upload dialog will even allow you to upload folders. If you upload a folder, the console will display that as a directory structure.

S3 Upload Dialog

So… have you been lied to? Are there actually folders in S3? The answer is: it depends. If we’re talking about the S3 API that we use to programmatically interface with the service, the only terminology you will see are buckets, objects, and sometimes prefixes. On the API level, there’s no such thing as a folder.

The S3 console is another matter. It’s an abstraction of the underlying API. It tries to simplify the service by making it behave like something that users are already familiar with: filesystems and hierarchical directory structures. The console will render keys that contain slashes as directories that you can navigate. This can be pretty convenient if you’re looking for something in the console because navigating a flat structure with lots of objects isn’t that fun.

So what happens when you create a new empty folder in the console?

S3 Create Folder Dialog

Well, the S3 console can only do what the API allows it to do, so it creates an Object to represent that folder. It has the special rule that it must end with a / and can’t contain slashes. This rule is enforced by the console. We can find out what kind of object it created using the HeadObject API through the AWS CLI:

$ aws s3api head-object --bucket dummy-bucket-for-my-blogpost-2023 --key 'this_is_a_"folder"/' --no-cli-pager
{
    "AcceptRanges": "bytes",
    "LastModified": "2023-01-13T10:46:24+00:00",
    "ContentLength": 0,
    "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
    "ContentType": "application/x-directory; charset=UTF-8",
    "ServerSideEncryption": "AES256",
    "Metadata": {}
}

As you can see, it’s a 0-byte object with the special content type application/x-directory that the console uses to identify this object’s purpose.

Naturally, this leads to the question: can we break it? Let’s try to create an object with the application/x-directory content type that doesn’t end with a slash:

$ aws s3api put-object --bucket dummy-bucket-for-my-blogpost-2023 --key fake-bucket --content-type 'application/x-directory; charset=UTF-8' --no-cli-pager
{
    "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
    "ServerSideEncryption": "AES256"
}

It turns out that setting the content type is not enough for the console to be confused - it sees an ordinary object(Yes, it should have been fake-folder, but I was too lazy to fix it).

S3 Fake Folder I can already hear your next question: What happens if we upload an object with a slash at the end and the application/x-directory content type? I’m glad you asked.

$ echo "Hello World" > hello_world.txt
$ aws s3api put-object --bucket dummy-bucket-for-my-blogpost-2023 --key 'fake-folder/' --content-type 'application/x-directory; charset=UTF-8' --body hello_world.txt --no-cli-pager
{
    "ETag": "\"e59ff97941044f85df5297e1c302d260\"",
    "ServerSideEncryption": "AES256"
}

The upload worked, and in the console, we can now see a fake-folder/ that acts like a “regular” console folder. The console doesn’t even allow me to download the object directly.

S3 Fake Folder Confusion

It doesn’t even display the object size, although Head-Object tells us it should be 12 bytes. Only if we navigate into the folder can we see a /-object with a size of 12 bytes which we could then download. If we say a file in a regular filesystem that had the name /, eyebrows would be raised.

This is what’s commonly referred to as a leaky abstraction. The S3 console tries to visualize something as a directory structure that isn’t one. Is this a problem? I’d argue that it isn’t that big of a deal because, in most cases, it’s a useful abstraction that makes your life easier, and that’s what counts.

(I also tested uploading an object with a slash-suffix without the content type, the behavior was identical to the one with the content type.)

What do we learn from this? S3 is a flat key-value store even though the console makes some effort to abstract that reality away from us. The documentation also explains how this is achieved.

Hopefully, you’ve learned something from this explanation.

— Maurice

Similar Posts You Might Enjoy

Serverless Spy Vs. Spy Chapter 3: X-Ray vs Jaeger - Send Lambda traces with open telemetry

In modern architectures, Lambda functions co-exist with containers. Cloud Native Observability is achieved with open telemetry. I show you how to send open telemetry traces from Lambda to a Jaeger tracing server. Let’s see how this compares to the X-Ray tracing service. - by Gernot Glawe

Serverless Spy Vs. Spy Chapter 2: AWS Distro for OpenTelemetry Lambda vs X-Ray SDK

We know how to follow traces with the X-Ray SDK. Now there is AWS Distro for OpenTelemetry claiming to do this better. Let’s build CDK examples for Lambda with TypeScript/Python/Go and find out who is the better spy in this game. - by Gernot Glawe

Serverless Spy Vs Spy Chapter 1: X-ray

There are several ways to perform espionage activities in the life of a serverless app, which all battle for your attention. Time for the advent of counterintelligence: We want answers! - And CDK/Source examples of how to use it! Here we go, Serverless spy vs spy in four chapters, each post published after you light the next candle. - by Gernot Glawe