Category Archives: Databases

Amazon AWS – Installing Redis on EBS

In this step-by-step guide I’ll show you how to install Redis on AWS (Amazon Linux AMI).

I’ll assume you’re performing the steps below as root (sudo -s).

  1. The first thing you need is to have the following tools installed:
    > gcc
    > gcc-c++
    > make

    yum -y install gcc gcc-c++ make
    


  2. Download Redis:
    cd /usr/local/src
    wget http://download.redis.io/releases/redis-2.8.12.tar.gz
    tar xzf redis-2.8.12.tar.gz
    rm -f redis-2.8.12.tar.gz
    


  3. Build it:
    cd redis-2.8.12
    make distclean
    make
    


  4. Create the following directories and copy the binaries:
    mkdir /etc/redis /var/redis
    cp src/redis-server src/redis-cli /usr/local/bin
    


  5. Copy the Redis template configuration file into /etc/redis/, using the instance’s port number as its name (per the best practices mentioned on the Redis site):
    cp redis.conf /etc/redis/6379.conf
    


  6. Create a directory inside /var/redis that will act as the working/data directory for this Redis instance:
    mkdir /var/redis/6379
    


  7. Edit the Redis config file to make the necessary changes:
    nano /etc/redis/6379.conf
    


  8. Make the following changes to 6379.conf (see the example below):
    > Set daemonize to yes (by default it is set to no).
    > Set pidfile to /var/run/redis.pid
    > Set preferred loglevel
    > Set logfile to /var/log/redis_6379.log
    > Set dir to /var/redis/6379
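
    After these edits, the relevant lines in 6379.conf should look roughly like this (loglevel notice is just an example; use whatever level you prefer):

    daemonize yes
    pidfile /var/run/redis.pid
    loglevel notice
    logfile /var/log/redis_6379.log
    dir /var/redis/6379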


  9. Don’t copy the standard Redis init script from the utils directory into /etc/init.d (it’s not Amazon Linux AMI/chkconfig compliant); instead, download the following:
    wget https://raw.githubusercontent.com/saxenap/install-redis-amazon-linux-centos/master/redis-server
    


  10. Move and chmod the downloaded Redis init script:
    mv redis-server /etc/init.d
    chmod 755 /etc/init.d/redis-server
    


  11. Edit the redis-server init script and set the Redis conf file name as follows:
    > REDIS_CONF_FILE="/etc/redis/6379.conf"

    nano /etc/init.d/redis-server
    


  12. Enable the Redis instance to start automatically at boot:
    chkconfig --add redis-server
    chkconfig --level 345 redis-server on
    


  13. Start Redis:
    service redis-server start
    


  14. (optional) Add ‘vm.overcommit_memory = 1’ to /etc/sysctl.conf (otherwise a background save may fail under low-memory conditions, according to the Redis site):
    > vm.overcommit_memory = 1

    nano /etc/sysctl.conf
    


  15. Activate the new sysctl change:
    sysctl vm.overcommit_memory=1
    


  16. Try pinging your instance with redis-cli:
    /usr/local/bin/redis-cli ping
    


  17. Do a few tests with redis-cli and check that the dump file is correctly stored in /var/redis/6379/ (you should find a file called dump.rdb):
    /usr/local/bin/redis-cli
    >set testkey testval
    >get testkey
    >del testkey
    >exit
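
    If you want to force a snapshot right away instead of waiting for an automatic save, you can also run the following (paths as configured in the steps above):

    /usr/local/bin/redis-cli save
    ls -l /var/redis/6379/dump.rdb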
    


  18. Check that your Redis instance is correctly logging to the log file:
    cat /var/log/redis_6379.log
    


 

And that would be basically it. Cheers.

 

Amazon AWS – SimpleDB – simple NoSQL

Basic information about Amazon SimpleDB Service:

 

AWS Free Tier availability:

  • 25 SimpleDB Machine Hours
  • 1GB of Storage

 

Developer Resources:

 

Functionality:

  • data sets are organized into domains (vs. tables in relational DBs)
  • Domains are collections of items that are described by attribute-value pairs
  • automatically creates an index for every field in a domain
  • no need to pre-define a schema
  • scale-out by creating new domains on different instances
  • stores multiple geographically distributed copies of each domain to enable high availability and data durability.
  • a successful write (using PutAttributes, BatchPutAttributes, DeleteAttributes, BatchDeleteAttributes, CreateDomain or DeleteDomain) means that all copies of the domain will durably persist
  • by default, GetAttributes and Select perform an eventually consistent read (details below).
    • a consistent read can potentially incur higher latency and lower read throughput; therefore, it is best to use it only when an application scenario mandates that a read operation absolutely needs to read all writes that received a successful response prior to that read. For all other scenarios, the default eventually consistent read will yield the best performance.
  • allows specifying consistency settings for each individual read request, so the same application could have disparate parts following different consistency settings.
  • currently enables domains to grow up to 10 GB each
  • initial allocation of domains is limited to 250

 

API Summary:

  • CreateDomain — Create a domain that contains your dataset.
  • DeleteDomain — Delete a domain.
  • ListDomains — List all domains.
  • DomainMetadata — Retrieve information about creation time for the domain, storage information both as counts of item names and attributes, as well as total size in bytes.
  • PutAttributes — Add or update an item and its attributes, or add attribute-value pairs to items that exist already. Items are automatically indexed as they are received.
  • BatchPutAttributes — For greater overall throughput of bulk writes, perform up to 25 PutAttributes operations in a single call.
  • DeleteAttributes — Delete an item, an attribute, or an attribute value.
  • BatchDeleteAttributes — For greater overall throughput of bulk deletes, perform up to 25 DeleteAttributes operations in a single call.
  • GetAttributes — Retrieve an item and all or a subset of its attributes and values.
  • Select — Query the data set using the familiar “select target from domain_name where query_expression” syntax. Supported value tests are: =, !=, <, >, <=, >=, like, not like, between, is null, is not null, and every(). Example: select * from mydomain where every(keyword) = 'Book'. Order results using the SORT operator, and count items that meet the condition(s) specified by the predicate(s) in a query using the Count operator.
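
As a rough sketch, here is how a few of these calls could look from Python using the legacy boto 2 library (the domain, item, and attribute names are made up, and AWS credentials are assumed to be configured in the environment):

    import boto

    # Connect to SimpleDB (region and credentials come from the environment/boto config)
    sdb = boto.connect_sdb()

    # CreateDomain / PutAttributes: items are indexed automatically as they are written
    domain = sdb.create_domain('mydomain')
    domain.put_attributes('item1', {'keyword': 'Book', 'price': '15'})

    # GetAttributes: eventually consistent by default
    print(domain.get_attributes('item1'))

    # Select: the same query syntax described above
    for item in domain.select("select * from mydomain where every(keyword) = 'Book'"):
        print(item)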

 

Consistency Options:

  • Eventually Consistent Reads (Default) — the eventual consistency option maximizes read performance (in terms of low latency and high throughput). However, an eventually consistent read (using Select or GetAttributes) might not reflect the results of a recently completed write (using PutAttributes, BatchPutAttributes, DeleteAttributes, BatchDeleteAttributes). Consistency across all copies of data is usually reached within a second; repeating a read after a short time should return the updated data.
  • Consistent Reads — in addition to eventual consistency, SimpleDB also gives flexibility to request a consistent read if your application, or an element of your application, requires it. A consistent read (using Select or GetAttributes with ConsistentRead=true) returns a result that reflects all writes that received a successful response prior to the read.
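
For example, with boto 2 (same hypothetical domain as in the sketch above) the two read modes differ only in the consistent_read flag:

    import boto

    sdb = boto.connect_sdb()
    domain = sdb.get_domain('mydomain')

    # Default: eventually consistent read (may not see a write completed moments ago)
    print(domain.get_attributes('item1'))

    # ConsistentRead=true: reflects all writes acknowledged before this read
    print(domain.get_attributes('item1', consistent_read=True))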

 

Transactions:

  • Conditional Puts/Deletes — enable you to insert, replace, or delete values for one or more attributes of an item if the existing value of an attribute matches the value specified. If the value does not match or is not present, the update is rejected. Conditional Puts/Deletes are useful for preventing lost updates when different sources write concurrently to the same item.
    • Conditional puts and deletes are exposed via the PutAttributes and DeleteAttributes APIs by specifying an optional condition with an expected value.
    • For example, if the application is reserving seats or selling tickets to an event, you might allow a purchase (i.e., write update) only if the specified seat was still available (the optional condition). These semantics can also be used to implement functionality such as counters, inserting an item only if it does not already exist, and optimistic concurrency control (OCC). An application can implement OCC by maintaining a version number (or a timestamp) attribute as part of an item and by performing a conditional put/delete based on the value of this version number.
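
A minimal optimistic-concurrency sketch of this pattern, assuming boto 2's expected_value parameter for conditional puts (item and attribute names are made up), could look like this:

    import boto

    sdb = boto.connect_sdb()
    domain = sdb.get_domain('tickets')

    # Read the current version of the item
    item = domain.get_attributes('seat-42', consistent_read=True)
    current_version = item.get('version', '0')

    # Conditional put: only succeeds if 'version' still holds the value we read;
    # if another writer got there first, SimpleDB rejects the update (boto raises an error)
    domain.put_attributes(
        'seat-42',
        {'owner': 'alice', 'version': str(int(current_version) + 1)},
        replace=True,
        expected_value=['version', current_version],
    )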

 

Limits:

  • Domain size: 10 GB per domain, 1 billion attributes per domain
  • Domain name: 3-255 characters (a-z, A-Z, 0-9, ‘_’, ‘-‘, and ‘.’)
  • Domains per account: 250
  • Attribute name-value pairs per item: 256
  • Attribute name length: 1024 bytes
  • Attribute value length: 1024 bytes
  • Item name length: 1024 bytes
  • Attribute name, attribute value, and item name allowed characters: All UTF-8 characters that are valid in XML documents. Control characters and any sequences that are not valid in XML are returned Base64-encoded. For more information, see Working with XML-Restricted Characters.
  • Attributes per PutAttributes operation: 256
  • Attributes requested per Select operation: 256
  • Items per BatchDeleteAttributes operation: 25
  • Items per BatchPutAttributes operation: 25
  • Maximum items in Select response: 2500
  • Maximum query execution time: 5 seconds
  • Maximum number of unique attributes per Select expression: 20
  • Maximum number of comparisons per Select expression: 20
  • Maximum response size for Select: 1MB

 

 

 

Resources:

Amazon AWS – DynamoDB – advanced NoSQL

Basic information about Amazon DynamoDB Service:

 

AWS Free Tier availability:

  • 100MB of Storage,
  • 5 Units of Write Capacity,
  • 10 Units of Read Capacity
  • up to 40 million free operations each month with eventually consistent reads, or 25 million operations each month with strictly consistent reads

 

Developer Resources:

 

Features:

  • Local secondary indexes
  • SSD-storage
  • automatic 3-way replication
  • you pay a flat, hourly rate based on the capacity you reserve
  • when creating or updating tables, you specify how much capacity you wish to reserve for reads and writes
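
To make the reserved-capacity model concrete, here is a sketch using the boto3 Python SDK (table name, key schema, and capacity numbers are made-up values; credentials and region are assumed to be configured):

    import boto3

    dynamodb = boto3.resource('dynamodb')

    # Reserve 5 read and 5 write capacity units up front for this table;
    # billing is based on the reserved capacity, not on individual requests.
    table = dynamodb.create_table(
        TableName='sessions',
        KeySchema=[
            {'AttributeName': 'user_id', 'KeyType': 'HASH'},   # hash (partition) key
            {'AttributeName': 'created', 'KeyType': 'RANGE'},  # range (sort) key
        ],
        AttributeDefinitions=[
            {'AttributeName': 'user_id', 'AttributeType': 'S'},
            {'AttributeName': 'created', 'AttributeType': 'N'},
        ],
        ProvisionedThroughput={'ReadCapacityUnits': 5, 'WriteCapacityUnits': 5},
    )
    table.wait_until_exists()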

 

Limits:

  • Table name: allowed characters are a-z, A-Z, 0-9, ‘_’ (underscore), ‘-‘ (dash), and ‘.’ (dot). Names can be between 3 and 255 characters long.
  • Local secondary index name: allowed characters are a-z, A-Z, 0-9, ‘_’ (underscore), ‘-‘ (dash), and ‘.’ (dot). Names can be between 3 and 255 characters long.
  • Table size: No practical limit in number of bytes or items
  • Tables per account: 256 per region (default). Increase can be requested.
  • Hash or hash-and-range primary key: No practical limit.
  • Number of hash key values: No practical limit.
  • Hash-and-range primary key: No practical limit for non-indexed tables (otherwise total sizes of all table and index items cannot exceed 10 GB)
  • Number of range keys per hash value: No practical limit for non-indexed tables (otherwise total sizes of all table and index items cannot exceed 10 GB)
  • Provisioned throughput capacity unit sizes: One read capacity unit = one strongly consistent read per second, or two eventually consistent reads per second, for items up to 4 KB in size. One write capacity unit = one write per second, for items up to 1 KB in size.
  • Provisioned throughput minimum per table: 1 read capacity unit and 1 write capacity unit
  • Provisioned throughput limits: can request an increase but default values are:
    • US East (Northern Virginia) Region:
      • Per table – 40,000 read capacity units or 40,000 write capacity units
      • Per account – 80,000 read capacity units or 80,000 write capacity units
    • All Other Regions:
      • Per table – 10,000 read capacity units or 10,000 write capacity units
      • Per account – 20,000 read capacity units or 20,000 write capacity units
  • UpdateTable: Limits when increasing provisioned throughput: You can call UpdateTable as often as necessary to increase provisioned throughput. You can increase ReadCapacityUnits or WriteCapacityUnits for a table, subject to these conditions:
    • You can call the UpdateTable API to increase ReadCapacityUnits or WriteCapacityUnits (or both), up to twice their current values.
    • The new provisioned throughput settings do not take effect until the UpdateTable operation is complete.
    • You can call UpdateTable multiple times, until you reach the desired throughput capacity for your table.
  • UpdateTable: Limits when decreasing provisioned throughput: You can reduce the provisioned throughput on a table no more than four times in a single UTC calendar day. These reductions can be any of the following operations:
    • Decrease ReadCapacityUnits.
    • Decrease WriteCapacityUnits.
    • Decrease both ReadCapacityUnits and WriteCapacityUnits in a single request. This counts as one of your allowed reductions for the day.
  • Maximum concurrent Control Plane API requests (includes the cumulative number of tables in the CREATING, UPDATING, or DELETING state): In general, you can have up to 10 of these requests running concurrently. The only exception is when you are CREATING a table and you have defined a local secondary index on that table. You can only have one such request running at a time.
  • Maximum number of local secondary indexes per table: You can define up to 5 local secondary indexes per table.
  • Maximum number of projected attributes per table (local secondary indexes only): You can project a total of up to 20 attributes into all of a table’s local secondary indexes. This only applies to user-specified projected attributes.

    In a CreateTable operation, if you specify a ProjectionType of INCLUDE, the total count of attributes specified in NonKeyAttributes, summed across all of the local secondary indexes, must not exceed 20. If you project the same attribute name into two different indexes, this counts as two distinct attributes when determining the total.

    This limit does not apply for indexes with a ProjectionType of KEYS_ONLY or ALL.

  • Attribute name lengths: The following attribute names are length-restricted:
    • Primary key attribute names.
    • The names of any user-specified projected attributes (applicable only to local secondary indexes). In a CreateTable operation, if you specify a ProjectionType of INCLUDE, then the names of the attributes in the NonKeyAttributes parameter are length-restricted. The KEYS_ONLY and ALL projection types are not affected.
    • For any of the attribute names listed above, the name must be between 1 and 255 characters long, inclusive. The name can be any UTF-8 encodable character, but the total size of the UTF-8 string after encoding cannot exceed 255 bytes.
  • Item size: Cannot exceed 64 KB, which includes both attribute name binary length (UTF-8 length) and attribute value lengths (again binary length). The attribute name counts towards the size limit. For example, consider an item with two attributes: one attribute named “shirt-color” with value “R” and another attribute named “shirt-size” with value “M”. The total size of that item is 23 bytes. These limits apply to items stored in tables, and also to items in local secondary indexes.
  • Attribute values: cannot be null or empty
  • Attribute name-value pairs per item: The cumulative size of attributes per item must be under 64 KB
  • Hash primary key attribute value: 2048 bytes
  • Range primary key attribute value: 1024 bytes
  • String: All strings must conform to the UTF-8 encoding. Since UTF-8 is a variable width encoding, string sizes are determined using the UTF-8 bytes.
  • Number: A number can have up to 38 digits of precision and can be between 10^-128 and 10^+126.
  • Maximum number of values in an attribute set: No practical limit on the quantity of values, as long as the item containing the values fits within the 64 KB item limit.
  • BatchGetItem item maximum per operation: Up to 100 items retrieved, with the request size not exceeding 1 MB.
  • BatchWriteItem item maximum per operation: Up to 25 put or delete operations, with the request size not exceeding 1 MB.
  • Query: Result set limited to 1 MB per API call. You can use the LastEvaluatedKey from the query response to retrieve more results.
  • Scan: Scanned data set size maximum is 1 MB per API call. You can use the LastEvaluatedKey from the scan response to retrieve more results.
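
Because Query and Scan responses are capped at 1 MB, clients are expected to page through results using LastEvaluatedKey. A boto3 sketch (reusing the hypothetical 'sessions' table from above) could look like this:

    import boto3
    from boto3.dynamodb.conditions import Key

    table = boto3.resource('dynamodb').Table('sessions')

    # Each Query call returns at most 1 MB of data; when more is available,
    # the response includes LastEvaluatedKey to use as the next starting point.
    items = []
    kwargs = {'KeyConditionExpression': Key('user_id').eq('user-1')}
    while True:
        resp = table.query(**kwargs)
        items.extend(resp['Items'])
        if 'LastEvaluatedKey' not in resp:
            break
        kwargs['ExclusiveStartKey'] = resp['LastEvaluatedKey']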

 

 

 

Resources: