Tag Archives: Amazon AWS

Amazon AWS – S3 – Simple Storage Service

AWS Free Tier availability:

  • 5GB storage,
  • 20,000 Get Requests,
  • 2,000 Put Requests

 

Functionality:

  • write, read, and delete objects containing from 1 byte to 5 terabytes of data each. The number of objects you can store is unlimited.
  • each object is stored in a bucket and retrieved via a unique, developer-assigned key.
  • a bucket can be stored in one of several Regions. You can choose a Region to optimize for latency, minimize costs, or address regulatory requirements. Amazon S3 is currently available in the following Regions:
    • US Standard (automatically routes requests to facilities in Northern Virginia or the Pacific Northwest using network maps)
    • US West (Oregon),
    • US West (Northern California),
    • EU (Ireland),
    • Asia Pacific (Singapore),
    • Asia Pacific (Tokyo),
    • Asia Pacific (Sydney),
    • South America (Sao Paulo),
    • GovCloud (US)
  • objects stored in a Region never leave the Region unless you transfer them out. For example, objects stored in the EU (Ireland) Region never leave the EU.
  • authentication mechanisms are provided to ensure that data is kept secure from unauthorized access. Objects can be made private or public, and rights can be granted to specific users.
  • options for secure data upload/download and encryption of data at rest are provided for additional data protection.
  • uses standards-based REST and SOAP interfaces designed to work with any Internet-development toolkit.
  • built to be flexible so that protocol or functional layers can easily be added. The default download protocol is HTTP.
  • a BitTorrent™ protocol interface is provided to lower costs for high-scale distribution.
  • provides functionality to simplify manageability of data through its lifetime. Includes options for segregating data by buckets, monitoring and controlling spend, and automatically archiving data to even lower cost storage options.
  • reliability backed with the Amazon S3 Service Level Agreement.

 

Three S3 storage options are available:

  • Standard storage
  • Reduced Redundancy Storage (RRS)
  • Amazon Glacier

 

Standard S3 storage:

  • designed for mission-critical and primary data storage
  • redundant storage of data in multiple facilities and on multiple devices within each facility.
  • synchronously stores data across multiple facilities before returning SUCCESS
  • calculates checksums on all network traffic to detect corruption of data packets when storing or retrieving data
  • performs regular, systematic data integrity checks and is built to be automatically self-healing.
  • further protection via Versioning, which lets you preserve, retrieve, and restore every version of every object stored in an Amazon S3 bucket
  • by default, requests will retrieve the most recently written version. Older versions of an object can be retrieved by specifying a version in the request. Storage rates apply for every version stored.
  • backed with the SLA
  • designed for 99.999999999% durability and 99.99% availability of objects over a given year.
  • designed to sustain the concurrent loss of data in two facilities.
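
The versioning behaviour described above (latest version by default, older versions on request, storage charged for every version) can be modelled with a small in-memory sketch. This is not the S3 API: `VersionedBucket`, `put`, `get`, and `stored_bytes` are hypothetical names, version numbers are simple counters, and deletes are not modelled.

```python
from typing import Optional


class VersionedBucket:
    """Toy in-memory model of a versioned bucket (hypothetical, not the S3 API)."""

    def __init__(self) -> None:
        # Maps each key to the list of its stored versions, oldest first.
        self._versions = {}

    def put(self, key: str, data: bytes) -> int:
        """Store a new version and return its 1-based version number."""
        history = self._versions.setdefault(key, [])
        history.append(data)
        return len(history)

    def get(self, key: str, version: Optional[int] = None) -> bytes:
        """Return the most recent version, or the requested one if given."""
        history = self._versions[key]
        if version is None:
            return history[-1]
        if not 1 <= version <= len(history):
            raise KeyError(f"{key!r} has no version {version}")
        return history[version - 1]

    def stored_bytes(self, key: str) -> int:
        """Every retained version counts toward the stored size."""
        return sum(len(data) for data in self._versions[key])


bucket = VersionedBucket()
v1 = bucket.put("report.txt", b"draft")
v2 = bucket.put("report.txt", b"final copy")
latest = bucket.get("report.txt")             # no version given: newest wins
older = bucket.get("report.txt", version=v1)  # explicit version: older copy
```

Here `latest` holds the second upload and `older` the first, while `stored_bytes` adds up both, which mirrors the note that storage rates apply to every version kept.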

 

Reduced Redundancy Storage (RRS):

  • a storage option within S3 to reduce costs by storing non-critical, reproducible data at lower levels of redundancy than Amazon S3’s standard storage.
  • cost-effective, highly available solution for distributing or sharing content that is durably stored elsewhere, or for storing thumbnails, transcoded media, or other processed data that can be easily reproduced
  • stores objects on multiple devices across multiple facilities, providing 400 times the durability of a typical disk drive
  • does not replicate objects as many times as standard Amazon S3 storage
  • backed with the SLA
  • designed to provide 99.99% durability and 99.99% availability of objects over a given year (durability level corresponds to an average annual expected loss of 0.01% of objects).
  • designed to sustain the loss of data in a single facility.
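
To make the durability figures concrete, the sketch below converts a durability percentage into an average expected number of objects lost per year. The function name `expected_annual_loss` and the sample count of 10,000 objects are assumptions chosen for illustration; exact fractions are used so that the many nines do not get lost to floating-point rounding.

```python
from fractions import Fraction


def expected_annual_loss(object_count: int, durability_percent: str) -> Fraction:
    """Average number of objects expected to be lost in one year."""
    # Loss rate is the complement of durability, e.g. 99.99% -> 0.01%.
    loss_rate = 1 - Fraction(durability_percent) / 100
    return object_count * loss_rate


rrs_loss = expected_annual_loss(10_000, "99.99")
standard_loss = expected_annual_loss(10_000, "99.999999999")

print(f"RRS: {rrs_loss} object(s) lost per year on average")
print(f"Standard: one object lost every {1 / standard_loss} years on average")
```

For 10,000 stored objects this works out to one object per year at 99.99% durability, versus one object every 10,000,000 years at 99.999999999% durability.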

 

Amazon Glacier:

  • extremely low-cost storage service as a storage option for data archival.
  • stores data for as little as $0.01 per gigabyte per month, and is optimized for data that is infrequently accessed and for which retrieval times of several hours are suitable. Examples include digital media archives, financial and healthcare records, raw genomic sequence data, long-term database backups, and data that must be retained for regulatory compliance.
  • like other S3 storage options (Standard or RRS), objects stored in Amazon Glacier using Amazon S3’s APIs or Management Console have an associated user-defined name.
  • you can get a real-time list of all of your Amazon S3 object names, including those stored using the Amazon Glacier option, by using the Amazon S3 LIST API.
  • objects stored directly in Amazon Glacier using Amazon Glacier’s APIs cannot be listed in real time, and have a system-generated identifier rather than a user-defined name.
  • for objects stored through Amazon S3, S3 maintains the mapping between your user-defined object name and the Amazon Glacier system-defined identifier.
  • to restore Amazon S3 data that was stored in Amazon Glacier via the Amazon S3 APIs or Management Console, you first have to initiate a restore job.
  • Restore jobs typically complete in 3 to 5 hours. Once the job is complete, you can access your data through an Amazon S3 GET request.
  • backed with the SLA
  • designed for 99.999999999% durability and 99.99% availability of objects over a given year.
  • designed to sustain the concurrent loss of data in two facilities.

 

Common Use Cases:

  • Content Storage and Distribution
  • Storage for Data Analysis
  • Backup, Archiving and Disaster Recovery
  • Static Website Hosting

 

 

 


Amazon AWS – SQS – Simple Queue Service

AWS Free Tier availability:

  • 1 million requests

 

Functionality:

  • no limit on the number of queues or the number of messages.
  • a queue can be created in any Region.
  • message payload can contain up to 256KB of text in any format. Each 64KB ‘chunk’ of payload is billed as 1 request. For example, a single API call with a 256KB payload will be billed as four requests.
  • messages can be sent, received or deleted in batches of up to 10 messages or 256KB. Batches cost the same amount as single messages, meaning SQS can be even more cost effective for customers that use batching.
  • long polling reduces extraneous polling to help you minimize cost while receiving new messages as quickly as possible. When your queue is empty, long-poll requests wait up to 20 seconds for the next message to arrive. Long poll requests cost the same amount as regular requests.
  • messages can be retained in queues for up to 14 days.
  • messages can be sent and read simultaneously.
  • when a message is received, it becomes “locked” while being processed. This keeps other computers from processing the message simultaneously. If the message processing fails, the lock will expire and the message will be available again. In the case where the application needs more time for processing, the “lock” timeout can be changed dynamically via the ChangeMessageVisibility operation.
  • developers can securely share Amazon SQS queues with others. Queues can be shared with other AWS accounts or anonymously. Queue sharing can also be restricted by IP address and time of day.
  • when combined with Amazon SNS, developers can ‘fanout’ identical messages to multiple SQS queues in parallel. When developers want to process the messages in multiple passes, fanout helps complete this more quickly, and with fewer delays due to bottlenecks at any one stage. Fanout also makes it easier to record duplicate copies of your messages, for example in different databases.
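
The 64KB billing rule above is simple arithmetic, sketched here as a helper. The name `billed_requests` is hypothetical, and two things are assumptions rather than statements from the text: that 1KB means 1,024 bytes, and that a payload smaller than one chunk still counts as one request.

```python
CHUNK_BYTES = 64 * 1024         # billing unit described above
MAX_PAYLOAD_BYTES = 256 * 1024  # maximum payload described above


def billed_requests(payload_bytes: int) -> int:
    """Number of billed requests for one API call with the given payload size."""
    if not 0 <= payload_bytes <= MAX_PAYLOAD_BYTES:
        raise ValueError("payload must be between 0 and 256KB")
    # Round up to whole chunks using integer arithmetic; bill at least one.
    chunks = (payload_bytes + CHUNK_BYTES - 1) // CHUNK_BYTES
    return max(1, chunks)


full_payload = billed_requests(256 * 1024)      # the 256KB example above
just_over_one = billed_requests(64 * 1024 + 1)  # one byte past the first chunk
```

A full 256KB payload comes out as four requests, matching the example in the list, and a payload one byte over 64KB is rounded up to two.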

 

Common design patterns with SQS and other AWS components:

  • Work Queues: Decoupling components of a distributed application that may not all process the same amount of work simultaneously.
  • Buffer and Batch Operations: Adding scalability and reliability to the architecture, and smoothing out temporary volume spikes without losing messages or increasing latency.
  • Request Offloading: Moving slow operations off interactive request paths by enqueueing the request.
  • Fanout: Combined with SNS to send identical copies of a message to multiple queues in parallel for simultaneous processing.
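
The fanout pattern can be illustrated with a toy in-memory model in which a topic copies each published message to every subscribed queue. This is only a sketch of the idea: `Topic`, `subscribe`, and `publish` are hypothetical names, plain `deque` objects stand in for SQS queues, and nothing here calls the real SNS or SQS APIs.

```python
from collections import deque


class Topic:
    """Toy stand-in for a topic that fans out to queues (not the SNS API)."""

    def __init__(self) -> None:
        self._queues = []

    def subscribe(self, queue: deque) -> None:
        """Register a queue that should receive every published message."""
        self._queues.append(queue)

    def publish(self, message: str) -> None:
        # Each subscribed queue receives its own copy of the message.
        for queue in self._queues:
            queue.append(message)


thumbnail_jobs = deque()  # hypothetical consumer 1
audit_log = deque()       # hypothetical consumer 2

topic = Topic()
topic.subscribe(thumbnail_jobs)
topic.subscribe(audit_log)
topic.publish("image-123 uploaded")
```

After the single `publish` call, both queues hold an identical message, so each consumer can process it independently and at its own pace.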

 

Service interface:

  • CreateQueue: Create queues for use with your AWS account.
  • ListQueues: List your existing queues.
  • DeleteQueue: Delete one of your queues.
  • SendMessage: Add messages to a specified queue.
  • SendMessageBatch: Add multiple messages to a specified queue.
  • ReceiveMessage: Return one or more messages from a specified queue.
  • ChangeMessageVisibility: Change the visibility timeout of a previously received message.
  • ChangeMessageVisibilityBatch: Change the visibility timeout of multiple previously received messages.
  • DeleteMessage: Remove a previously received message from a specified queue.
  • DeleteMessageBatch: Remove multiple previously received messages from a specified queue.
  • SetQueueAttributes: Control queue settings like the amount of time that messages are locked after being read so they cannot be read again.
  • GetQueueAttributes: Get information about a queue like the number of messages in it.
  • GetQueueUrl: Get the queue URL.
  • AddPermission: Add queue sharing for another AWS account for a specified queue.
  • RemovePermission: Remove an AWS account from queue sharing for a specified queue.
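
Before calling a batch operation such as SendMessageBatch, messages have to be grouped so that each batch stays within the limits mentioned earlier (up to 10 messages or 256KB). The sketch below shows one way to do that grouping locally. The name `make_batches` is hypothetical, and measuring size as the UTF-8 length of the message bodies alone is an assumption.

```python
from typing import List

MAX_BATCH_ENTRIES = 10        # batch entry limit described earlier
MAX_BATCH_BYTES = 256 * 1024  # batch size limit described earlier


def make_batches(messages: List[str]) -> List[List[str]]:
    """Group messages, in order, into batches that respect both limits."""
    batches = []
    current = []
    current_bytes = 0
    for message in messages:
        size = len(message.encode("utf-8"))
        if size > MAX_BATCH_BYTES:
            raise ValueError("a single message exceeds the 256KB limit")
        batch_full = len(current) == MAX_BATCH_ENTRIES
        too_big = current_bytes + size > MAX_BATCH_BYTES
        if current and (batch_full or too_big):
            # Close the current batch and start a new one.
            batches.append(current)
            current, current_bytes = [], 0
        current.append(message)
        current_bytes += size
    if current:
        batches.append(current)
    return batches


small = make_batches([f"job-{i}" for i in range(25)])
large = make_batches(["x" * (100 * 1024)] * 3)
```

Twenty-five small messages split by count, while three 100KB messages split by size because a third one would push the batch past 256KB.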

 

Message Lifecycle:

  • A system that needs to send a message will find an Amazon SQS queue, and use SendMessage to add a new message to it.
  • A different system that processes messages is ready for more work, so it calls ReceiveMessage, which returns this message.
  • Once a message has been returned by ReceiveMessage, it will not be returned by any other ReceiveMessage until the visibility timeout has passed. This keeps multiple computers from processing the same message at once.
  • If the system that processes messages successfully finishes working with this message, it calls DeleteMessage, which removes the message from the queue so no one else will ever process it. If this system fails to process the message, then it will be read by another ReceiveMessage call as soon as the visibility timeout passes.
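
The lifecycle above can be traced with a toy in-memory queue. This is a simplified model, not the SQS API: `ToyQueue` and its methods are hypothetical, time is passed in explicitly as `now` so the example is deterministic, and messages are deleted by a plain numeric ID for brevity.

```python
import itertools
from typing import Optional, Tuple


class ToyQueue:
    """In-memory model of the message lifecycle (hypothetical, not the SQS API)."""

    def __init__(self, visibility_timeout: float) -> None:
        self.visibility_timeout = visibility_timeout
        self._ids = itertools.count(1)
        # Maps message id -> [body, time at which the message is visible again].
        self._messages = {}

    def send_message(self, body: str) -> int:
        message_id = next(self._ids)
        self._messages[message_id] = [body, 0.0]
        return message_id

    def receive_message(self, now: float) -> Optional[Tuple[int, str]]:
        """Return (id, body) of the first visible message, or None."""
        for message_id, entry in self._messages.items():
            body, visible_at = entry
            if now >= visible_at:
                # Hide the message until the visibility timeout has passed.
                entry[1] = now + self.visibility_timeout
                return message_id, body
        return None

    def delete_message(self, message_id: int) -> None:
        self._messages.pop(message_id, None)


queue = ToyQueue(visibility_timeout=30.0)
queue.send_message("resize image-123")

first = queue.receive_message(now=0.0)    # worker A receives the message
hidden = queue.receive_message(now=10.0)  # worker B gets nothing: still locked
retry = queue.receive_message(now=30.0)   # A never deleted it, so it reappears
queue.delete_message(retry[0])            # B finishes and deletes the message
gone = queue.receive_message(now=100.0)   # nothing left to process
```

The second receive returns nothing because the message is still inside its visibility timeout, the third returns it again because it was never deleted, and after `delete_message` it is gone for good.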

 

 

 
