Buckets : Names share a global namespace [best practice is to include your organisation name in the bucket name].
Names are 3 to 63 characters long: lowercase letters, numbers, hyphens and periods.
100 buckets per account.
Created in a specific region.
A bucket can store an unlimited number of objects
Private by default
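A minimal sketch of the naming rule above, written as a Python check. It covers only the length and character rules from these notes, plus the requirement that a name starts and ends with a letter or number and has no adjacent periods. Real S3 applies further rules that are not covered here, and the function name is hypothetical.

```python
import re

# Simplified pattern: first and last character are a lowercase letter or a
# digit, with 1 to 61 letters, digits, hyphens or periods in between
# (3 to 63 characters in total).
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")


def looks_like_valid_bucket_name(name: str) -> bool:
    """Return True if `name` passes this simplified naming check."""
    if ".." in name:  # adjacent periods are not allowed
        return False
    return _BUCKET_NAME.fullmatch(name) is not None


for candidate in ["acme-corp-logs", "Acme_Logs", "ab"]:
    print(candidate, looks_like_valid_bucket_name(candidate))
```

Because the namespace is global, a name that passes this check can still be rejected if another account already owns it.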
Objects: Entities or files stored in Amazon S3 buckets.
Can store virtually any data in any format.
Object size ranges from 0 bytes to 5 TB.
Keys: Each object in a bucket has a unique identifier called a key.
S3 Operations :
Create/Delete buckets.
Write/Read/Delete an object.
List the keys in a bucket.
Durability and Availability :
99.999999999% durability [Will my data still be there?]
99.99% availability [Can I access my data right now?]
To protect against accidental deletion, use versioning, cross-region replication and MFA Delete.
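To make the two percentages concrete, here is a small arithmetic sketch. It assumes both figures are read as yearly averages, and the count of 10,000,000 objects is just an illustrative number. `Fraction` is used so the tiny loss rate is not distorted by floating-point rounding.

```python
from fractions import Fraction

DURABILITY = Fraction("99.999999999") / 100   # eleven nines, from the notes
AVAILABILITY = Fraction("99.99") / 100        # four nines, from the notes

MINUTES_PER_YEAR = 365 * 24 * 60


def expected_objects_lost_per_year(object_count: int) -> Fraction:
    """Average number of objects lost per year at the stated durability."""
    return object_count * (1 - DURABILITY)


def expected_downtime_minutes_per_year() -> Fraction:
    """Average minutes per year during which data may be unreachable."""
    return MINUTES_PER_YEAR * (1 - AVAILABILITY)


loss = expected_objects_lost_per_year(10_000_000)
print("objects lost per year:", float(loss))
print("years per lost object:", float(1 / loss))
print("downtime minutes/year:", float(expected_downtime_minutes_per_year()))
```

Under these assumptions, storing ten million objects works out to losing one object roughly every 10,000 years, while the availability figure allows a little under an hour of unreachable time per year.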
Data Consistency :
Provides read-after-write consistency.
Updates to a single key are atomic — you’ll never get a mix of the data (old and new)
Access Control :
Private by default
ACLs (Access control List)
Grants coarse-grained permissions : READ, WRITE, and Full-Control at bucket level or object level.
Bucket policies are the recommended access control mechanism for S3.
Bucket policies
Bucket policies are similar to IAM policies, but are attached at the bucket level.
They include an explicit reference to an IAM principal.
S3 bucket policies allow cross-account access to S3.
You can allow or block access by source IP range (CIDR) and by time of day.
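The points above can be sketched as a policy document built in Python. The bucket name, account ID, user and time window are hypothetical placeholders. The condition keys `aws:SourceIp` and `aws:CurrentTime` take a CIDR range and absolute timestamps, so this sketch allows a single window rather than a recurring daily schedule.

```python
import json


def build_bucket_policy(bucket, principal_arn, cidr, not_before, not_after):
    """Return a bucket policy dict allowing reads from one IP range
    inside one time window."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadFromOfficeNetwork",
                "Effect": "Allow",
                # Explicit principal; it may belong to another account.
                "Principal": {"AWS": principal_arn},
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {
                    "IpAddress": {"aws:SourceIp": cidr},
                    "DateGreaterThan": {"aws:CurrentTime": not_before},
                    "DateLessThan": {"aws:CurrentTime": not_after},
                },
            }
        ],
    }


policy = build_bucket_policy(
    bucket="example-bucket",                               # hypothetical
    principal_arn="arn:aws:iam::111122223333:user/alice",  # hypothetical
    cidr="203.0.113.0/24",                                 # documentation range
    not_before="2024-01-01T09:00:00Z",
    not_after="2024-01-01T17:00:00Z",
)
print(json.dumps(policy, indent=2))
```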
S3 Advanced Features
Prefixes and Delimiters :
S3 uses a flat structure in bucket.
Use prefixes and delimiters to organise data logically and emulate a folder hierarchy.
Access control can be applied to prefixes
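A small sketch of how a prefix and a delimiter turn a flat set of keys into something that looks like folders. This is a local imitation of the listing behaviour, not a call to S3, and the function name and sample keys are hypothetical.

```python
def list_keys(keys, prefix="", delimiter=""):
    """Mimic an S3-style listing over a flat collection of keys.

    Returns (contents, common_prefixes): the keys directly under the prefix,
    and the rolled-up "subfolder" prefixes one level below it.
    """
    contents, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Roll everything up to the first delimiter into one prefix.
            head = rest.split(delimiter, 1)[0]
            common_prefixes.add(prefix + head + delimiter)
        else:
            contents.append(key)
    return contents, sorted(common_prefixes)


keys = [
    "logs/2024/01/app.log",
    "logs/2024/02/app.log",
    "logs/readme.txt",
    "photos/cat.jpg",
]
print(list_keys(keys, prefix="logs/", delimiter="/"))
```

Listing with prefix `logs/` and delimiter `/` returns `logs/readme.txt` as a direct entry and `logs/2024/` as a rolled-up prefix, even though the bucket itself stores no folders.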
Storage Class.
S3 Standard
Delivers low first-byte latency and high throughput.
Best choice for frequently accessed data.
S3 Standard-IA (Infrequent Access)
Same durability as Standard, with low latency and high throughput.
Designed for long-lived, less frequently accessed data
Storage price is lower than Standard; minimum storage duration is 30 days
Minimum billable object size is 128 KB.
Per-GB retrieval cost.
RRS (Reduced Redundancy Storage)
Lower durability: 99.99%
Suited to easily reproducible data such as thumbnails.
Intended as a cheaper option than Standard.
Amazon Glacier [small intro]
Best for storing archives
Up to 5% of the data stored in Glacier can be retrieved free each month.
Takes hours to restore
A restore makes a temporary copy in RRS; the data remains in Glacier until you delete it.
It's also a standalone service.
Object lifecycle Management
Hot(Frequently access data)→ Warm (Less frequent data)→ Cold (Long-term archival data)
Reduces storage cost by automating storage class transitions based on object age.
Initially on Amazon S3 Standard
After 30 days, transition to IA
After 90 days, Glacier
After 3 years, delete.
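The schedule above can be sketched as a lifecycle rule. The dictionary follows the shape of an S3 lifecycle configuration, but the rule ID and the `logs/` prefix are hypothetical, and 3 years is taken as 3 × 365 = 1095 days. The helper is only a local illustration of which class an object of a given age would be in. It ignores any delay before S3 actually applies a rule.

```python
# Lifecycle rule mirroring the schedule in the notes (assumed names/prefix).
LIFECYCLE_CONFIGURATION = {
    "Rules": [
        {
            "ID": "archive-then-expire",
            "Filter": {"Prefix": "logs/"},
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 1095},
        }
    ]
}


def storage_class_at(age_days, rule):
    """Return the storage class an object of this age would be in under
    the rule, or None once it has expired."""
    if age_days >= rule["Expiration"]["Days"]:
        return None
    current = "STANDARD"
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = LIFECYCLE_CONFIGURATION["Rules"][0]
for age in (0, 30, 90, 1095):
    print(age, storage_class_at(age, rule))
```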
Best practice is to encrypt data in flight and at rest.
Amazon S3 SSL API endpoints
Assures that all data will be encrypted while in transit using HTTPS protocol.
Encrypting data at rest
Use SSE (Server-Side Encryption)
AWS encrypts data at the object level before writing it to disk and decrypts it when you retrieve it.
SSE-S3 and SSE-KMS (Key Management Service) use 256-bit AES (Advanced Encryption Standard)
Client side Encryption.
SSE-S3 (AWS manages key)
Fully integrated “check-box-style”
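AWS handles key management and key protection
SSE-KMS (Key Management Service)
Keys are managed by you through AWS KMS
Adds separate permissions for using the key, which tightens access control at the object level
Provides an audit trail of key usage at object level and user level
For example, who used a key and when, and failed attempts to access data
SSE-C (Customer-provided key)
You control everything: you supply and manage the keys, and AWS performs the encryption and decryption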
Best practice
Use server-side encryption with SSE-S3 or SSE-KMS
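As an illustration of the two recommended options, the sketch below builds the request headers that ask S3 to encrypt an object at rest. The helper name and the key ID in the usage line are hypothetical. Only the header names and values come from the S3 API.

```python
def sse_headers(mode, kms_key_id=None):
    """Return the request headers that ask S3 for server-side encryption."""
    if mode == "SSE-S3":
        # S3-managed keys.
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:
            # Optional: name a specific KMS key instead of the default one.
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    raise ValueError(f"unsupported mode: {mode}")


print(sse_headers("SSE-S3"))
print(sse_headers("SSE-KMS", kms_key_id="example-key-id"))  # hypothetical ID
```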
Versioning protects against accidental deletion.
Keeps multiple versions of each object in your bucket.
Applied at the bucket level
Once enabled, it can't be disabled, only suspended
Data can be restored from a previous version of the object
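Why versioning protects against accidental deletion can be shown with a toy in-memory model. This is not the S3 API: the class, its method names and the `v1`, `v2` style version IDs are all hypothetical. The point it illustrates is that a delete only places a marker on top of the version stack, so earlier versions stay retrievable.

```python
import itertools

_DELETE_MARKER = object()  # stands in for an S3 delete marker


class VersionedBucket:
    """Toy model of a versioning-enabled bucket."""

    def __init__(self):
        self._versions = {}             # key -> [(version_id, data), ...]
        self._ids = itertools.count(1)

    def put(self, key, data):
        version_id = f"v{next(self._ids)}"
        self._versions.setdefault(key, []).append((version_id, data))
        return version_id

    def delete(self, key):
        # Nothing is erased; a delete marker becomes the newest version.
        return self.put(key, _DELETE_MARKER)

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is None:
            if not versions or versions[-1][1] is _DELETE_MARKER:
                raise KeyError(key)
            return versions[-1][1]
        for vid, data in versions:
            if vid == version_id and data is not _DELETE_MARKER:
                return data
        raise KeyError((key, version_id))


bucket = VersionedBucket()
first = bucket.put("report.txt", b"draft")
bucket.put("report.txt", b"final")
bucket.delete("report.txt")
print(bucket.get("report.txt", version_id=first))
```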
MFA Delete
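Adds another layer of data protection on top of versioning
Requires an additional authentication code, for example an OTP from a hardware or virtual MFA device, to permanently delete an object version.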
Multipart Upload
Used to upload large files in multiple parts
Better network utilisation
Ability to pause and resume
Can start uploading before the final object size is known
It's a three-step process
Split the file into parts
Upload the parts in any order
AWS reassembles the parts at its end
It's better to use multipart upload for files larger than 100 MB
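The three steps above can be sketched locally, without calling S3. The 8 MiB default part size is an arbitrary choice for illustration. The 5 MiB minimum part size (for every part except the last) and the 10,000-part ceiling are S3 limits. Both function names are hypothetical.

```python
MIN_PART_SIZE = 5 * 1024 * 1024   # 5 MiB minimum for every part but the last
MAX_PARTS = 10_000                # maximum number of parts per upload


def plan_parts(total_size, part_size=8 * 1024 * 1024):
    """Return [(part_number, first_byte, last_byte), ...] covering the file."""
    if part_size < MIN_PART_SIZE:
        raise ValueError("part size is below the 5 MiB minimum")
    parts = []
    offset, number = 0, 1
    while offset < total_size:
        end = min(offset + part_size, total_size)
        parts.append((number, offset, end - 1))
        offset, number = end, number + 1
    if len(parts) > MAX_PARTS:
        raise ValueError("too many parts; use a larger part size")
    return parts


def reassemble(uploaded):
    """Join parts by part number, whatever order they arrived in."""
    return b"".join(data for _, data in sorted(uploaded.items()))


print(plan_parts(20 * 1024 * 1024))
print(reassemble({2: b"world", 1: b"hello "}))
```

Because each part carries its own number, parts can be sent in parallel or retried individually, which is where the better network utilisation and pause/resume behaviour come from.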
Cross Region Replication
Asynchronously copies objects from a bucket in one region to a bucket in another region
Reduces latency for users closer to the destination region
Versioning must be turned on for both the source and destination buckets
Requires an IAM policy that lets S3 replicate objects on your behalf
New objects and changes in the source bucket trigger replication to the destination bucket.
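Logging: for surveillance of requests made to your bucket
Best practice is to give the logs a prefix (for example logs/), whether they are saved in the same bucket or a different one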
Event Notification
Set up at the bucket level
Uses Amazon SNS or SQS
Sends messages when events occur in your bucket, such as an object being created or deleted.
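A consumer of these messages usually wants the bucket and key of each affected object. The sketch below assumes the message body is the raw S3 event JSON. A message relayed through SNS is wrapped in an extra envelope that is not handled here. The function name and sample values are hypothetical. Object keys arrive URL-encoded, so they are decoded before use.

```python
import json
from urllib.parse import unquote_plus


def objects_in_event(message_body):
    """Yield (event_name, bucket, key) for each record in an S3 event."""
    event = json.loads(message_body)
    for record in event.get("Records", []):
        s3 = record["s3"]
        yield (
            record["eventName"],
            s3["bucket"]["name"],
            unquote_plus(s3["object"]["key"]),  # keys are URL-encoded
        )


sample = json.dumps({
    "Records": [{
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "example-bucket"},
            "object": {"key": "uploads/my+photo.jpg", "size": 1024},
        },
    }]
})
print(list(objects_in_event(sample)))
```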
Best practices, patterns and performance
Use S3 as blob storage and keep an index of the data in a database; this enables quick searches and complex queries.
If a workload makes extensive GET requests, use an Amazon CloudFront distribution as a caching layer
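The blob-storage-plus-index pattern can be sketched with an in-memory SQLite table standing in for the database. The table layout, column names and sample rows are hypothetical. In practice the index would be updated whenever an object is written to or removed from the bucket.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects ("
    " key TEXT PRIMARY KEY, owner TEXT, content_type TEXT, size INTEGER)"
)
conn.executemany(
    "INSERT INTO objects VALUES (?, ?, ?, ?)",
    [
        ("photos/cat.jpg", "alice", "image/jpeg", 204_800),
        ("photos/dog.jpg", "bob", "image/jpeg", 512_000),
        ("docs/report.pdf", "alice", "application/pdf", 1_048_576),
    ],
)


def keys_for(owner, min_size=0):
    """Find object keys by metadata without listing the bucket itself."""
    rows = conn.execute(
        "SELECT key FROM objects WHERE owner = ? AND size >= ? ORDER BY key",
        (owner, min_size),
    )
    return [row[0] for row in rows]


print(keys_for("alice", min_size=500_000))
```

The query returns keys only. The object data is then fetched from S3 by key, which keeps the bucket as plain storage and leaves searching to the database.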
Amazon Glacier
- Used for long-term backup
- Data is typically stored as TAR or ZIP files
- Archives are immutable once created
- Data is stored in archives; each archive can hold up to 40 TB
- Can have an unlimited number of archives
- Automatically encrypted
- Vaults are containers for archives
- An AWS account can have up to 1,000 vaults per region
- Vault Lock lets you easily deploy and enforce compliance controls on a Glacier vault
- Write once, read many (because archives are immutable)