Buckets: Global namespace (best practice: include your organisation name in the bucket name).
- Names are 3-63 characters: lowercase letters, numbers, hyphens, and periods.
- 100 buckets per account (default limit).
- Created in a specific region.
- A bucket can store an unlimited number of objects.
- Private by default.
Objects: Entities or files stored in Amazon S3 buckets.
- Can store virtually any data in any format.
- Objects range in size from 0 bytes to 5 TB.
Keys: Each object in an S3 bucket has a unique identifier called a key.
S3 Operations:
- Create/delete buckets.
- Write/read/delete objects.
- List keys in a bucket (see the sketch below).
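A minimal sketch of these operations using boto3 (the AWS SDK for Python); the bucket name and region are hypothetical placeholders:

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-org-example-bucket"  # hypothetical name; must be globally unique

# Create a bucket in a specific region (omit CreateBucketConfiguration for us-east-1).
s3.create_bucket(
    Bucket=bucket,
    CreateBucketConfiguration={"LocationConstraint": "eu-west-1"},
)

# Write, read, and delete an object.
s3.put_object(Bucket=bucket, Key="reports/2024/summary.txt", Body=b"hello s3")
body = s3.get_object(Bucket=bucket, Key="reports/2024/summary.txt")["Body"].read()

# List keys in the bucket.
for obj in s3.list_objects_v2(Bucket=bucket).get("Contents", []):
    print(obj["Key"])

s3.delete_object(Bucket=bucket, Key="reports/2024/summary.txt")
s3.delete_bucket(Bucket=bucket)  # the bucket must be empty first
```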
Durability and Availability:
- 99.999999999% (eleven 9s) durability [will my data still be there?].
- 99.99% availability [can I access my data right now?].
- To protect against accidental deletion, use versioning, cross-region replication, and MFA Delete.
Data Consistency:
- Provides read-after-write consistency for new objects.
- Updates to a single key are atomic: you will never read a mix of old and new data.
Access Control:
- Private by default.
- ACLs (Access Control Lists)
  - Grant coarse-grained permissions (READ, WRITE, FULL_CONTROL) at the bucket or object level.
- Bucket policies are the recommended access control for S3.
- Bucket policies
  - Similar to IAM policies, but applied at the bucket level.
  - They include an explicit Principal element referencing an IAM principal.
  - Bucket policies allow cross-account access to S3.
  - You can allow or block access based on source IP (CIDR range) and time of day (see the policy sketch below).
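A sketch of a bucket policy granting cross-account read access restricted to a CIDR range; the account ID, bucket name, and CIDR block are hypothetical:

```python
import json

import boto3

s3 = boto3.client("s3")

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "CrossAccountReadFromOfficeIPs",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111122223333:root"},  # another account
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::my-org-example-bucket/*",
            "Condition": {"IpAddress": {"aws:SourceIp": "203.0.113.0/24"}},
        }
    ],
}
s3.put_bucket_policy(Bucket="my-org-example-bucket", Policy=json.dumps(policy))
```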
S3 Advanced Features
Prefixes and Delimiters:
- S3 uses a flat structure within a bucket.
- Use prefixes and delimiters to logically organise data and maintain a hierarchy (example below).
- Access control can be applied at the prefix level.
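A sketch of listing one "folder" level using a prefix and the "/" delimiter; the bucket name and key layout are hypothetical:

```python
import boto3

s3 = boto3.client("s3")

# Treat "logs/2024/" as a folder even though S3 itself is flat.
resp = s3.list_objects_v2(
    Bucket="my-org-example-bucket",
    Prefix="logs/2024/",
    Delimiter="/",
)
for sub in resp.get("CommonPrefixes", []):  # "subfolders" under the prefix
    print(sub["Prefix"])
for obj in resp.get("Contents", []):        # objects directly under the prefix
    print(obj["Key"])
```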
Storage Classes (chosen per object; see the sketch after this list):
- S3 Standard
  - Delivers low first-byte latency and high throughput.
  - Best choice for frequently accessed data.
- S3 Standard-IA (Infrequent Access)
  - Same durability as Standard, low latency, high throughput.
  - Designed for long-lived, less frequently accessed data.
  - Priced lower than Standard; minimum storage duration of 30 days.
  - Minimum billable object size of 128 KB.
  - Per-GB retrieval cost.
- RRS (Reduced Redundancy Storage)
  - Lower durability: 99.99%.
  - Best for easily reproducible data, such as thumbnails.
  - Cheaper than the other storage classes.
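The storage class is set per object at write time; a sketch with boto3, using hypothetical bucket and key names:

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-org-example-bucket"  # hypothetical

# Write directly to Standard-IA.
s3.put_object(
    Bucket=bucket,
    Key="archive/2023-report.pdf",
    Body=b"...",
    StorageClass="STANDARD_IA",
)

# Change the class of an existing object by copying it over itself.
s3.copy_object(
    Bucket=bucket,
    Key="thumbs/cat.jpg",
    CopySource={"Bucket": bucket, "Key": "thumbs/cat.jpg"},
    StorageClass="REDUCED_REDUNDANCY",
)
```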
Amazon Glacier [small intro]
- Best for storing archives.
- You can retrieve up to 5% of your Glacier data for free each month.
- Restores take hours.
- Restoring via S3 makes a temporary copy in RRS; the data remains in Glacier until you delete it.
- It is also a standalone service (covered in more detail below).
Object Lifecycle Management
Hot (frequently accessed data) → Warm (less frequently accessed data) → Cold (long-term archival data)
Lifecycle rules reduce storage cost by automating storage-class transitions based on object age, for example (rule sketch below):
- Initially on Amazon S3 Standard.
- After 30 days, transition to Standard-IA.
- After 90 days, transition to Glacier.
- After 3 years, delete.
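A sketch of that exact schedule as a lifecycle rule; the bucket name and prefix are hypothetical:

```python
import boto3

s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-org-example-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "hot-warm-cold",
                "Filter": {"Prefix": "logs/"},  # apply the rule to one prefix
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 1095},  # delete after roughly 3 years
            }
        ]
    },
)
```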
Encryption
- Best practice is to encrypt data in flight and at rest.
- Data in flight
  - Amazon S3 SSL API endpoints ensure all data is encrypted in transit using the HTTPS protocol.
- Data at rest
  - Use SSE (Server-Side Encryption).
  - AWS encrypts data at the object level before writing it to disk and decrypts it when you retrieve it.
  - SSE and Amazon KMS (Key Management Service) use 256-bit AES (Advanced Encryption Standard).
  - Client-side encryption (encrypting data yourself before upload) is the alternative.
- SSE-S3 (AWS manages the keys)
  - Fully integrated, "check-box-style" encryption.
  - AWS handles key management.
- SSE-KMS (Key Management Service)
  - You manage the keys through KMS.
  - Adds access control at the object level.
  - Tracks key usage at the object and user level (e.g., access attempts and failures).
- SSE-C (customer-provided keys)
  - You control everything.
- Best practice
  - Use server-side encryption with SSE-S3 or SSE-KMS (sketch below).
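A sketch of both recommended server-side options on upload; the bucket, keys, and KMS alias are hypothetical:

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-org-example-bucket"  # hypothetical

# SSE-S3: AWS-managed keys, 256-bit AES.
s3.put_object(
    Bucket=bucket,
    Key="private/sse-s3.txt",
    Body=b"secret",
    ServerSideEncryption="AES256",
)

# SSE-KMS: keys managed in KMS, with per-key access control and an audit trail.
s3.put_object(
    Bucket=bucket,
    Key="private/sse-kms.txt",
    Body=b"secret",
    ServerSideEncryption="aws:kms",
    SSEKMSKeyId="alias/my-app-key",  # hypothetical KMS key alias
)
```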
Versioning
- Protects against accidental deletion.
- Keeps multiple versions of each object in your bucket.
- Applied at the bucket level.
- Once enabled it can't be turned off, only suspended.
- Data can be retrieved from a previous version of an object (sketch below).
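A sketch of enabling versioning and reading back an older version; names are hypothetical:

```python
import boto3

s3 = boto3.client("s3")
bucket = "my-org-example-bucket"  # hypothetical

# Enable versioning (can later be suspended, but not removed).
s3.put_bucket_versioning(
    Bucket=bucket,
    VersioningConfiguration={"Status": "Enabled"},
)

# List the versions of one object; they come back newest first.
versions = s3.list_object_versions(Bucket=bucket, Prefix="reports/summary.txt")
oldest = versions["Versions"][-1]
body = s3.get_object(
    Bucket=bucket,
    Key="reports/summary.txt",
    VersionId=oldest["VersionId"],
)["Body"].read()
```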
MFA Delete
- Adds another layer of data protection to buckets.
- Deleting an object version requires a one-time code from a hardware or virtual MFA device (sketch below).
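A sketch of turning it on; MFA Delete can only be enabled by the root account through the API, and the bucket name, device serial, and code shown here are placeholders:

```python
import boto3

s3 = boto3.client("s3")  # must be called with root-account credentials

s3.put_bucket_versioning(
    Bucket="my-org-example-bucket",  # hypothetical
    # Format: "<MFA device serial> <current OTP code>" (both placeholders).
    MFA="arn:aws:iam::111122223333:mfa/root-account-mfa-device 123456",
    VersioningConfiguration={"Status": "Enabled", "MFADelete": "Enabled"},
)
```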
Multipart Upload
- Used for uploading large files in multiple parts.
- Better network utilisation.
- Ability to pause and resume.
- Can upload files of unknown size.
- It's a three-step process:
  - Initiation
  - Uploading the parts
  - Completion
- Parts can be uploaded in any order.
- AWS reassembles the parts on their end.
- It's better to use multipart upload if the file is larger than 100 MB (sketch below).
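A sketch of the three steps with boto3; the bucket and file names are hypothetical (in practice, boto3's higher-level upload_file does this automatically for large files):

```python
import boto3

s3 = boto3.client("s3")
bucket, key = "my-org-example-bucket", "videos/big.mp4"  # hypothetical

# Step 1: initiation.
upload = s3.create_multipart_upload(Bucket=bucket, Key=key)

# Step 2: upload the parts (every part except the last must be at least 5 MB).
parts = []
with open("big.mp4", "rb") as f:
    part_number = 1
    while chunk := f.read(100 * 1024 * 1024):  # 100 MB parts
        resp = s3.upload_part(
            Bucket=bucket, Key=key,
            UploadId=upload["UploadId"],
            PartNumber=part_number, Body=chunk,
        )
        parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
        part_number += 1

# Step 3: completion; AWS assembles the parts into a single object.
s3.complete_multipart_upload(
    Bucket=bucket, Key=key,
    UploadId=upload["UploadId"],
    MultipartUpload={"Parts": parts},
)
```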
Cross-Region Replication
- Asynchronously copies objects in an S3 bucket from one region to another.
- Reduces latency (data sits closer to its users).
- Versioning must be enabled on both the source and destination buckets.
- You must provide an IAM role that allows S3 to replicate objects on your behalf.
- Any change to the source bucket triggers replication to the destination bucket (configuration sketch below).
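A sketch of the replication configuration; the role ARN and bucket names are hypothetical, and both buckets must already have versioning enabled:

```python
import boto3

s3 = boto3.client("s3")
s3.put_bucket_replication(
    Bucket="my-org-source-bucket",  # hypothetical, versioning enabled
    ReplicationConfiguration={
        # The role S3 assumes to replicate objects on your behalf.
        "Role": "arn:aws:iam::111122223333:role/s3-replication-role",
        "Rules": [
            {
                "ID": "replicate-everything",
                "Prefix": "",  # empty prefix = all objects
                "Status": "Enabled",
                "Destination": {"Bucket": "arn:aws:s3:::my-org-destination-bucket"},
            }
        ],
    },
)
```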
Logging
- For surveillance/auditing of access to the bucket.
- Best practice is to save logs in the same bucket under a prefix (sketch below).
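A sketch of enabling server access logging; the bucket name and prefix are hypothetical:

```python
import boto3

s3 = boto3.client("s3")
s3.put_bucket_logging(
    Bucket="my-org-example-bucket",  # hypothetical
    BucketLoggingStatus={
        "LoggingEnabled": {
            "TargetBucket": "my-org-example-bucket",  # same bucket...
            "TargetPrefix": "logs/",                  # ...under a dedicated prefix
        }
    },
)
```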
Event Notifications
- Set up at the bucket level.
- Use AWS SNS or SQS.
- Trigger messages on events in your bucket (sketch below).
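A sketch of sending object-created events to an SQS queue; the queue ARN and bucket name are hypothetical, and the queue's policy must allow S3 to send messages to it:

```python
import boto3

s3 = boto3.client("s3")
s3.put_bucket_notification_configuration(
    Bucket="my-org-example-bucket",  # hypothetical
    NotificationConfiguration={
        "QueueConfigurations": [
            {
                "QueueArn": "arn:aws:sqs:us-east-1:111122223333:s3-events",  # hypothetical
                "Events": ["s3:ObjectCreated:*"],  # fire on every new object
            }
        ]
    },
)
```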
Best Practices, Patterns and Performance
- Use S3 as blob storage and index the metadata in a database; this enables quick searches and complex queries (pattern sketch below).
- If S3 serves extensive GET requests, use an Amazon CloudFront distribution as a caching layer.
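A minimal sketch of the blob-plus-index pattern, using SQLite as a stand-in for the index database; the bucket, table, and helper function are all hypothetical:

```python
import sqlite3

import boto3

s3 = boto3.client("s3")
db = sqlite3.connect("object_index.db")  # stand-in for a real database
db.execute(
    "CREATE TABLE IF NOT EXISTS objects"
    " (key TEXT PRIMARY KEY, size INTEGER, uploaded TEXT)"
)

def store(bucket: str, key: str, body: bytes) -> None:
    """Write the blob to S3 and record searchable metadata in the index."""
    s3.put_object(Bucket=bucket, Key=key, Body=body)
    db.execute(
        "INSERT OR REPLACE INTO objects VALUES (?, ?, datetime('now'))",
        (key, len(body)),
    )
    db.commit()

# Complex queries hit the index, not S3.
store("my-org-example-bucket", "images/cat.jpg", b"...")
small = db.execute("SELECT key FROM objects WHERE size < 1024").fetchall()
```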
Amazon Glacier
- Used for long-term backup.
- Data is typically aggregated into ZIP or TAR files before upload.
Archives
- Immutable once created.
- Data is stored in archives; an archive can hold up to 40 TB.
- You can have an unlimited number of archives.
- Automatically encrypted.
Vaults
- Containers for archives.
- An AWS account can have up to 1,000 vaults.
- Vault policies make it easy to deploy and enforce compliance controls.
- Write once, read many (because archives are immutable); see the upload sketch below.
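A sketch of creating a vault and uploading an archive through the standalone Glacier API; the vault name and file are hypothetical:

```python
import boto3

glacier = boto3.client("glacier")

# Vault: a container for archives (up to 1,000 per account).
glacier.create_vault(vaultName="monthly-backups")  # hypothetical name

# Archive: immutable once uploaded; keep the returned ID to retrieve it later.
with open("backup-2024-01.zip", "rb") as f:
    resp = glacier.upload_archive(
        vaultName="monthly-backups",
        archiveDescription="January 2024 backup",
        body=f,
    )
print(resp["archiveId"])
```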