Tuesday, 2 June 2020

More on DAX

In the previous post we set up a DAX client and tested GetItem performance. This post looks at other DAX features.

DAX also supports BatchGetItem. It serves the items it already holds in its cache and fetches the rest from DynamoDB.
In addition to GetItem, the DAX client also supports BatchGetItem requests. 
BatchGetItem is essentially a wrapper around one or more GetItem requests, so DAX
treats each of these as an individual GetItem operation.
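
To make the "batch is a loop of GetItem calls" idea concrete, here is a minimal toy model. It is not how DAX is implemented and it does not use the AWS SDK: plain maps stand in for the DynamoDB table and the item cache, and the class and field names (BatchGetSketch, tableReads) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: plain maps stand in for the DynamoDB table and the DAX item cache.
final class BatchGetSketch {
    final Map<String, String> table = new HashMap<>();     // stand-in for DynamoDB
    final Map<String, String> itemCache = new HashMap<>(); // stand-in for the item cache
    int tableReads = 0;                                     // reads that reached the table

    // A single lookup: item cache first, table on a miss.
    // (Missing items are not cached here, to keep the sketch short.)
    String getItem(String key) {
        if (itemCache.containsKey(key)) {
            return itemCache.get(key);
        }
        tableReads++;
        String value = table.get(key);
        if (value != null) {
            itemCache.put(key, value);
        }
        return value;
    }

    // A batch is resolved key by key, each one like an individual getItem.
    List<String> batchGetItem(List<String> keys) {
        List<String> results = new ArrayList<>();
        for (String key : keys) {
            results.add(getItem(key));
        }
        return results;
    }

    public static void main(String[] args) {
        BatchGetSketch dax = new BatchGetSketch();
        dax.table.put("u1", "Alice");
        dax.table.put("u2", "Bob");
        dax.getItem("u1"); // warms the cache for u1 only
        List<String> batch = dax.batchGetItem(Arrays.asList("u1", "u2"));
        System.out.println(batch + " tableReads=" + dax.tableReads);
    }
}
```

In this model the batch reads u1 from the cache and only u2 from the table, so a partially warm cache still saves table reads.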
All write operations are treated as 'write-through':
For BatchWriteItem, UpdateItem, DeleteItem, and PutItem operations, data is first written 
to the DynamoDB table, and then to the DAX cluster. The operation is successful 
only if the data is successfully written to both the table and to DAX.
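
The write-through order can be sketched with a small toy model. Again, this is only an illustration under my own assumptions, not DAX's implementation: maps stand in for the table and the cache, and the tableAvailable flag is a made-up switch to simulate a failed table write.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: shows the "table first, then cache" order of a write-through.
final class WriteThroughSketch {
    final Map<String, String> table = new HashMap<>();     // stand-in for DynamoDB
    final Map<String, String> itemCache = new HashMap<>(); // stand-in for the item cache
    boolean tableAvailable = true; // hypothetical switch to simulate a failed table write

    // Returns true only if both the table and the cache were written.
    boolean putItem(String key, String value) {
        if (!tableAvailable) {
            return false;          // table write failed: the cache is left untouched
        }
        table.put(key, value);     // step 1: write to the table
        itemCache.put(key, value); // step 2: write to the cache
        return true;
    }

    public static void main(String[] args) {
        WriteThroughSketch dax = new WriteThroughSketch();
        System.out.println("first write ok: " + dax.putItem("u1", "Alice"));
        dax.tableAvailable = false;
        System.out.println("second write ok: " + dax.putItem("u1", "Alicia"));
        System.out.println("cached value: " + dax.itemCache.get("u1"));
    }
}
```

Because the table is written first, a failed table write never leaves a value in the cache that the table does not have.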
These are all operations against the item cache. DAX also provides a query cache:
DAX caches the results from Query and Scan requests in its query cache. However, 
these results don't affect the item cache at all. When your application issues 
a Query or Scan request with DAX, the result set is saved in the query cache — 
not in the item cache. You can't "warm up" the item cache by performing a Scan 
operation because the item cache and query cache are separate entities.
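
The separation of the two caches can be pictured with a toy model like the one below. It is a sketch under my own assumptions, not DAX internals: the query cache is keyed by a string describing the scan, and all names (SeparateCachesSketch, scanAgeBetween) are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: a scan fills the query cache and never touches the item cache.
final class SeparateCachesSketch {
    final Map<String, Integer> table = new TreeMap<>();           // name -> age, stand-in for DynamoDB
    final Map<String, Integer> itemCache = new HashMap<>();       // filled by single-item reads only
    final Map<String, List<String>> queryCache = new HashMap<>(); // filled by scans only

    List<String> scanAgeBetween(int low, int high) {
        String cacheKey = "age between " + low + " and " + high;
        List<String> cached = queryCache.get(cacheKey);
        if (cached != null) {
            return cached; // served from the query cache
        }
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : table.entrySet()) {
            if (entry.getValue() >= low && entry.getValue() <= high) {
                names.add(entry.getKey());
            }
        }
        queryCache.put(cacheKey, names); // the whole result set is cached as one entry
        return names;
    }

    public static void main(String[] args) {
        SeparateCachesSketch dax = new SeparateCachesSketch();
        dax.table.put("alice", 21);
        dax.table.put("bob", 24);
        dax.table.put("carol", 40);
        System.out.println("scan result: " + dax.scanAgeBetween(20, 25));
        System.out.println("item cache size: " + dax.itemCache.size());
        System.out.println("query cache size: " + dax.queryCache.size());
    }
}
```

After the scan the query cache holds one entry and the item cache is still empty, which is why a scan cannot be used to warm up the item cache.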

    // Needs Item, ItemCollection, Page, ScanFilter, ScanOutcome and Table from
    // com.amazonaws.services.dynamodbv2.document, LambdaLogger from
    // com.amazonaws.services.lambda.runtime, and java.math.BigDecimal.
    private void scanItemsTest(LambdaLogger lambdaLogger, Table dbTable, Table daxTable) {
        // Test 1 - Scan without DAX
        lambdaLogger.log("Tests for ScanItem");
        lambdaLogger.log("Dynamo Tests");
        computeScanItemDuration(lambdaLogger, dbTable);
        // Test 2 - Scan with DAX - result not yet in the query cache
        lambdaLogger.log("DAX Tests - not in Cache");
        computeScanItemDuration(lambdaLogger, daxTable);
        // Test 3 - Scan with DAX - result present in the query cache
        lambdaLogger.log("DAX Tests - present in Cache");
        computeScanItemDuration(lambdaLogger, daxTable);
    }

    private void computeScanItemDuration(LambdaLogger lambdaLogger, Table table) {
        // Keep only items whose "age" attribute is between 20 and 25 (inclusive).
        ScanFilter scanFilter = new ScanFilter("age")
                .between(20, 25);
        long startTime = System.nanoTime();
        ItemCollection<ScanOutcome> outcomeItemCollection = table.scan(scanFilter);
        int totalAge = 0, totalRecs = 0;
        int pageCount = 0;
        for (Page<Item, ScanOutcome> page : outcomeItemCollection.pages()) {
            ScanOutcome lowLevelResult = page.getLowLevelResult();
            totalAge += lowLevelResult.getItems().stream()
                    .map(item -> (BigDecimal) (item.get("age")))
                    .mapToInt(BigDecimal::intValue).sum();
            pageCount++;
            totalRecs += lowLevelResult.getItems().size();
        }
        // Stop the clock before logging so that log I/O is not part of the measurement.
        long elapsedNanos = System.nanoTime() - startTime;
        // Guard against division by zero when the filter matches nothing.
        int avgAge = totalRecs == 0 ? 0 : totalAge / totalRecs;
        lambdaLogger.log("Tested total of " + totalRecs + " with avg age found to be " + avgAge + ". Total Pages " + pageCount);
        // Despite the label, this is the total elapsed time for the whole scan.
        lambdaLogger.log("Average FetchTime = " + elapsedNanos + " nano seconds");
    }
The results of the test are as below:
START RequestId: 7de2a34d-abd0-4d7f-9d3c-87ae3cf117ce Version: $LATEST
Tests for ScanItem
Dynamo Tests
Tested total of 10042 with avg age found to be 22. Total Pages 4
Average FetchTime = 11459072286 nano seconds
DAX Tests - not in Cache
Tested total of 10042 with avg age found to be 22. Total Pages 4
Average FetchTime = 4819390513 nano seconds
DAX Tests - present in Cache
Tested total of 10042 with avg age found to be 22. Total Pages 4
Average FetchTime = 1823347748 nano seconds
END RequestId: 7de2a34d-abd0-4d7f-9d3c-87ae3cf117ce
REPORT RequestId: 7de2a34d-abd0-4d7f-9d3c-87ae3cf117ce 
Duration: 18234.71 ms Billed Duration: 18300 ms 
Memory Size: 256 MB Max Memory Used: 142 MB Init Duration: 1757.49 ms 

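As seen, the query cache is faster than going to DynamoDB directly: the scan took about 11.5 s against DynamoDB, about 4.8 s through DAX when the result was not yet cached, and about 1.8 s once the result was in the query cache. (The logged "Average FetchTime" is the total elapsed time for the scan.) However, there are some caveats to using the query cache.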
Updates to the item cache, or to the underlying DynamoDB table, do not invalidate 
or modify the results stored in the query cache. Your application should consider 
the TTL value for the query cache and how long your application can tolerate 
inconsistent results between the query cache and the item cache.
If we added some records with age 25 to the table, we would expect the average age to change. However, since the DAX query cache is not aware of these changes, it would keep returning the old result until the cached entry expires.
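
Here is a small toy model of that stale-result behaviour. It is only a sketch under my own assumptions: a single cached average stands in for the cached scan result, the caller passes the current time in explicitly so the example is deterministic, and the names (StaleQueryCacheSketch, ttlMillis) are made up.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: a cached scan result is not invalidated by writes, only by its TTL.
final class StaleQueryCacheSketch {
    final List<Integer> tableAges = new ArrayList<>(); // stand-in for the "age" values in the table
    final long ttlMillis;                               // hypothetical query cache TTL
    Double cachedAverage = null;                        // stand-in for the cached scan result
    long cachedAtMillis = 0;

    StaleQueryCacheSketch(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Writes reach the table but never touch the cached scan result.
    void putAge(int age) {
        tableAges.add(age);
    }

    // Returns the cached average while it is younger than the TTL, else recomputes it.
    double averageAge(long nowMillis) {
        if (cachedAverage != null && nowMillis - cachedAtMillis < ttlMillis) {
            return cachedAverage;
        }
        double sum = 0;
        for (int age : tableAges) {
            sum += age;
        }
        cachedAverage = tableAges.isEmpty() ? 0.0 : sum / tableAges.size();
        cachedAtMillis = nowMillis;
        return cachedAverage;
    }

    public static void main(String[] args) {
        StaleQueryCacheSketch dax = new StaleQueryCacheSketch(1000);
        dax.putAge(20);
        dax.putAge(24);
        System.out.println("t=0     avg=" + dax.averageAge(0));
        dax.putAge(25);
        dax.putAge(25);
        System.out.println("t=500   avg=" + dax.averageAge(500));  // still the old result
        System.out.println("t=1000  avg=" + dax.averageAge(1000)); // TTL expired, recomputed
    }
}
```

The second call still reports the old average even though two records were added; only after the TTL has passed does the new data show up.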
There is another corner case with DAX:

DynamoDB read requests (GetItem, Query, Scan) are eventually consistent by default. 
DAX attempts to serve such requests from its cache. If a request is marked as a 
strongly consistent read, then DAX passes it through to DynamoDB 
and does not update its cache with the result.
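
A toy model of the pass-through for strongly consistent reads could look like this. It is a sketch, not DAX's implementation: maps stand in for the table and the item cache, and writing to the table map directly simulates a change that the cache has not seen. All names are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: strongly consistent reads skip the cache in both directions.
final class ConsistentReadSketch {
    final Map<String, String> table = new HashMap<>();     // stand-in for DynamoDB
    final Map<String, String> itemCache = new HashMap<>(); // stand-in for the item cache

    String getItem(String key, boolean consistentRead) {
        if (consistentRead) {
            return table.get(key); // pass-through: cache is neither read nor updated
        }
        if (itemCache.containsKey(key)) {
            return itemCache.get(key);
        }
        String value = table.get(key);
        if (value != null) {
            itemCache.put(key, value);
        }
        return value;
    }

    public static void main(String[] args) {
        ConsistentReadSketch dax = new ConsistentReadSketch();
        dax.table.put("u1", "v1");
        dax.getItem("u1", false);  // caches v1
        dax.table.put("u1", "v2"); // simulated change the cache has not seen
        System.out.println("consistent read: " + dax.getItem("u1", true));
        System.out.println("default read:    " + dax.getItem("u1", false));
    }
}
```

The consistent read sees the latest table value, while the next default read still returns the older cached value, because the pass-through did not refresh the cache.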

One more cool aspect of DAX is 'negative caching'. Consider the case where multiple clients try to locate an item through DAX. If the item exists, DAX caches it for future use. But what if the item does not exist?
DAX supports negative cache entries in both the item cache and the query cache.
A negative cache entry occurs when DAX can't find requested items in an 
underlying DynamoDB table. Instead of generating an error, DAX caches an 
empty result and returns that result to the user.
For example, suppose that an application sends a GetItem request to a DAX 
cluster, and that there is no matching item in the DAX item cache. This causes
DAX to read the corresponding item from the underlying DynamoDB table. If the 
item doesn't exist in DynamoDB, DAX stores an empty item in its item cache
and returns the empty item to the application. Now suppose that the 
application sends another GetItem request for the same item. DAX finds the 
empty item in the item cache and returns it to the application immediately. 
It does not consult DynamoDB at all.
A negative cache entry remains in the DAX item cache until its item TTL has 
expired, its LRU is invoked, or the item is modified using PutItem, 
UpdateItem, or DeleteItem.
The DAX query cache handles negative cache results in a similar way. If an 
application performs a Query or Scan, and the DAX query cache doesn't 
contain a cached result, DAX sends the request to DynamoDB. If there are 
no matching items in the result set, DAX stores an empty result set in 
the query cache and returns the empty result set to the application. 
Subsequent Query or Scan requests yield the same (empty) result set, 
until the TTL for that result set has expired.
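
Negative caching for the item cache can be sketched with a toy model as well. This is an illustration under my own assumptions, not DAX internals: an empty Optional stands in for the cached "empty item", TTL and LRU eviction are left out, and the names (NegativeCacheSketch, tableReads) are made up.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Toy model only: a miss in the table is cached as an empty entry.
final class NegativeCacheSketch {
    final Map<String, String> table = new HashMap<>();               // stand-in for DynamoDB
    final Map<String, Optional<String>> itemCache = new HashMap<>(); // empty Optional = negative entry
    int tableReads = 0;                                               // reads that reached the table

    Optional<String> getItem(String key) {
        Optional<String> cached = itemCache.get(key);
        if (cached != null) {
            return cached; // cache hit, possibly a negative entry
        }
        tableReads++;
        Optional<String> result = Optional.ofNullable(table.get(key));
        itemCache.put(key, result); // caches the value or the "not found" answer
        return result;
    }

    // A write replaces any negative entry for the same key.
    void putItem(String key, String value) {
        table.put(key, value);
        itemCache.put(key, Optional.of(value));
    }

    public static void main(String[] args) {
        NegativeCacheSketch dax = new NegativeCacheSketch();
        System.out.println("first lookup present:  " + dax.getItem("ghost").isPresent());
        System.out.println("second lookup present: " + dax.getItem("ghost").isPresent());
        System.out.println("table reads so far:    " + dax.tableReads);
        dax.putItem("ghost", "now exists");
        System.out.println("after put:             " + dax.getItem("ghost").get());
    }
}
```

Only the first lookup of the missing key reaches the table; the second is answered from the negative entry, and the put replaces that entry with the real item.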
