Learning the code way: Serverless - My delayed transition

The tech world has been moving away from servers for quite some time. While I was introduced to Lambda a few years back, I was hesitant to leave what I knew and jump into this new style of development. My initial experiences left me feeling that Lambda is expensive, that it cannot replace having your own server up all the time, and that it is only for niche use cases.

Cut to 2020 and the number of use cases that AWS Lambda can solve has grown. Serverless is the way to go for a lot of them. I have been using AWS extensively and am now encouraged to re-enter the serverless world with my own set of use cases. This blog will be the first of my AWS series, one where I discuss Lambda's suitability for various solutions.

Before I start:

What is serverless?
It is a style of computing where you provide a service (an API, for example) without owning or managing the server on which the code is deployed.

So there is a server?
Yes, you cannot have your code running without it being hosted on some machine (AKA the server). But with serverless, you do not set up, own or manage a server instance. Your cloud provider (in this case AWS) gives it to you.

So serverless means cloud?
Well, cloud is basically a technology that lets companies run their servers without actually buying a server machine and setting up the infrastructure at a physical location. Cloud companies like Amazon own large farms of server machines (or data centers) on which you can 'lease' a 'server instance' and run your code. Serverless is a cloud computing model which goes one step further: you do not worry about the server instance anymore. Just write your code and let the cloud (AWS in this case) worry about executing it when the use case arises.

So when does my code execute?
The serverless computing model is an event-driven model. The code executes in response to an event. The event could be an API request, a notification, a set of messages in a queue or a cron trigger.
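
The event-driven idea can be sketched in plain Java without any AWS dependency. This is only an illustration of "code runs when an event arrives", not how Lambda is implemented; the class `EventDispatchSketch`, its `Event` type and the source names are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class EventDispatchSketch {

    // Hypothetical event: a source name (for example "s3" or "cron") plus a payload.
    public static final class Event {
        public final String source;
        public final String payload;

        public Event(String source, String payload) {
            this.source = source;
            this.payload = payload;
        }
    }

    // Maps an event source to the code that should run for it.
    private final Map<String, Function<Event, String>> handlers = new HashMap<>();

    public void register(String source, Function<Event, String> handler) {
        handlers.put(source, handler);
    }

    // Runs the matching handler only when an event arrives; nothing runs in between.
    public String dispatch(Event event) {
        Function<Event, String> handler = handlers.get(event.source);
        if (handler == null) {
            throw new IllegalArgumentException("No handler for source: " + event.source);
        }
        return handler.apply(event);
    }

    public static void main(String[] args) {
        EventDispatchSketch platform = new EventDispatchSketch();
        platform.register("s3", e -> "processed file " + e.payload);
        platform.register("cron", e -> "ran scheduled job " + e.payload);

        // Prints "processed file World.txt"
        System.out.println(platform.dispatch(new Event("s3", "World.txt")));
    }
}
```

In the real thing, the registry and the dispatch loop belong to the cloud provider; you only supply the handler.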

What happens when this 'event' occurs?
Whenever your serverless code needs to be executed, AWS will pick up the code that you have written and run it on their server. This platform/server where your code is executed is known as AWS Lambda.

Let us start with an example. Consider that you have an S3 bucket to which some service is uploading a file. When the file is uploaded, you need to run code that parses the file, extracts some information and puts the information in a different S3 location. In the pre-Lambda world, you would set up an EC2 instance (i.e. a server instance) and periodically check the S3 location for a new file. This is inefficient server usage if a file is added only about once a day. Why pay for a host that does some processing once a day? Instead, let's use AWS Lambda.

With AWS Lambda you need an event to trigger your code. In this case S3 provides notifications whenever a file is added to a bucket. This notification can be used to trigger your Lambda code.

To start with, I am going to set up two buckets: my source bucket, where the service will upload the file to process, and my target bucket, where my Lambda will upload the processed file.

The service file contains a single word.

Robin

My Lambda will read the word from the file, process it (prefix it with 'Hello') and then store the resultant file in a different S3 bucket.
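
The processing step on its own can be tried locally before any AWS setup. Below is a minimal sketch using only the JDK; the class and method names are hypothetical, and the `trim()` call is my assumption (an uploaded file often ends with a newline, which would otherwise end up in the middle of the greeting).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class GreetingSketch {

    // Reads the whole stream as UTF-8 text (readAllBytes needs Java 9 or later).
    public static String readText(InputStream in) throws IOException {
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }

    // Prefixes the word with "Hello". Trimming is an assumption, not part of the
    // original description: it drops a trailing newline from the file content.
    public static String greet(String text) {
        return "Hello " + text.trim() + " !";
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the uploaded file: the single word followed by a newline.
        InputStream uploaded =
                new ByteArrayInputStream("Robin\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(greet(readText(uploaded))); // prints "Hello Robin !"
    }
}
```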

Prerequisite steps that I had to complete to run this example:

1. Set up an AWS account
2. Create an admin user and retrieve the access key ID and secret key. (AWS Doc Link)
3. Install the AWS CLI (Doc Link) and configure it for use with the admin user created in step 2 (Doc Link)

The next step was to create a Lambda that reacts to S3 events. For this I set up a bucket in S3. S3 buckets can be configured to send notifications on changes within the bucket.

The notification configuration on this bucket means that any PUT event results in a notification being sent to a Lambda function named S3EventLambda. (The alternative destinations are SNS and SQS.)
(S3 provides many more notification types. For details on configuring the bucket, refer to the AWS doc.)

Now to look at the Lambda function. I coded this in Java.

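import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;

import com.amazonaws.AmazonServiceException;
import com.amazonaws.services.lambda.runtime.Context;
import com.amazonaws.services.lambda.runtime.LambdaLogger;
import com.amazonaws.services.lambda.runtime.RequestHandler;
import com.amazonaws.services.lambda.runtime.events.S3Event;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
// Assumption: aws-lambda-java-events 2.x, where S3Event extends the S3 SDK's S3EventNotification
import com.amazonaws.services.s3.event.S3EventNotification.S3EventNotificationRecord;
import com.amazonaws.services.s3.model.GetObjectRequest;
import com.amazonaws.services.s3.model.ObjectMetadata;
import com.amazonaws.services.s3.model.S3Object;

public class S3FileUploadHandler implements
        //Lambda request handlers implement AWS Lambda Function application logic
        // using plain old java objects as input and output.
        //S3Event is input, Void is output
        RequestHandler<S3Event, Void> {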

    private static final String DESTINATION_BUCKET_NAME = "mylambda-destination";

    /*
     * The code to be triggered
     * S3Event - S3 EventNotification item sent by S3 to SQS, SNS, or Lambda
     * Context - When Lambda runs your function, it passes a context object to the handler.
     *           This object provides methods and properties that provide information about the invocation,
     *           function, and execution environment.
     */
    public Void handleRequest(S3Event s3event, Context context) {
        LambdaLogger logger = context.getLogger();
        logger.log("In Handler: Executing " + context.getFunctionName() + ", " + context.getFunctionVersion()
                + ", " + context.getMemoryLimitInMB() + ", " + context.getRemainingTimeInMillis());
        logger.log(s3event.toJson());
        try {
            //Get the details of the notification
            S3EventNotificationRecord record = s3event.getRecords().get(0);
            String srcBucket = record.getS3().getBucket().getName();
            // Object key may have spaces or unicode non-ASCII characters.
            String srcKey = record.getS3().getObject().getUrlDecodedKey();

            //Get the text from S3 file
            String text = getTextFromSrcFile(srcBucket, srcKey);

            //Process the input
            String output = processInput(text);
            writeToS3(srcKey, output);
            System.out.println("Successfully resized " + srcBucket + "/"
                    + srcKey + " and uploaded to " + DESTINATION_BUCKET_NAME + "/" + srcKey);

        } catch (IOException | AmazonServiceException e) {
            logger.log("Error " + e.getMessage());
            throw new RuntimeException(e);
        } finally {
            System.out.println("Completed Handler: Executing " + context.getFunctionName()
                    + ", " + context.getFunctionVersion());
        }
        return null;
    }

As seen, the class implements the RequestHandler interface. The handleRequest method receives the S3Event published by S3 (when a file is put in the bucket) and a Context instance, which provides details about the Lambda as well as a logger instance.
The code does nothing fancy. It reads the uploaded file, prefixes its content with "Hello" and writes the result to a different bucket.

    private void writeToS3(String srcKey, String output) {
        // Encode once with an explicit charset so the declared length matches the uploaded bytes
        byte[] bytes = output.getBytes(java.nio.charset.StandardCharsets.UTF_8);
        InputStream is = new ByteArrayInputStream(bytes);
        // Set Content-Length and Content-Type
        ObjectMetadata meta = new ObjectMetadata();
        meta.setContentLength(bytes.length);
        meta.setContentType("text/plain");
        // Uploading to S3 destination bucket
        System.out.println("Writing to: " + DESTINATION_BUCKET_NAME + "/" + srcKey);
        AmazonS3 s3Client = AmazonS3ClientBuilder.defaultClient();
        s3Client.putObject(DESTINATION_BUCKET_NAME, srcKey, is, meta);
    }

    private String processInput(String text) {
        return "Hello " + text + " !";
    }

    private String getTextFromSrcFile(String srcBucket, String srcKey) throws IOException {
        // Download the file from S3 into a stream
        AmazonS3 s3Client = AmazonS3ClientBuilder.defaultClient();
        S3Object s3Object = s3Client.getObject(new GetObjectRequest(srcBucket, srcKey));
        InputStream objectData = s3Object.getObjectContent();

        StringBuilder textBuilder = new StringBuilder();
        // Decode with an explicit charset; closing the reader also closes the S3 stream
        try (Reader reader = new BufferedReader(
                new InputStreamReader(objectData, java.nio.charset.StandardCharsets.UTF_8))) {
            int c;
            while ((c = reader.read()) != -1) {
                textBuilder.append((char) c);
            }
        }
        return textBuilder.toString();
    }
}

It is pretty plain (and clunky) - my version of Hello World !
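
One detail in the handler deserves a note: S3 event notifications carry the object key in URL-encoded form (a space arrives as '+'), which is why the code calls getUrlDecodedKey() rather than using the raw key. A minimal JDK-only sketch of what that decoding step amounts to is below. The class and method names are hypothetical, and I am assuming the SDK's decoding behaves like the standard URLDecoder for such keys.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class ObjectKeySketch {

    // Decodes a URL-encoded object key: '+' becomes a space and
    // %XX sequences become the characters they encode.
    // The (String, Charset) overload needs Java 10 or later.
    public static String decodeKey(String encodedKey) {
        return URLDecoder.decode(encodedKey, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Prints "reports/My File(1).txt"
        System.out.println(decodeKey("reports/My+File%281%29.txt"));
    }
}
```

Using the undecoded key in a getObject call would look for an object that does not exist whenever the file name contains spaces or special characters.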

To make this a Lambda, we need to create a single jar that also includes all the dependent jars. For this code to work (I am using Maven) I needed to add three dependencies:

aws-lambda-java-core - interface definitions for AWS Lambda
aws-lambda-java-events - event interface definitions supported by AWS Lambda.
aws-java-sdk-s3 - AWS Java SDK for Amazon S3

I also added the aws-lambda-java-log4j2 dependency, which appends the invocation request ID to logs [Link]

The maven-shade-plugin can build such a jar. Executing the command below results in an uber jar being created:

mvn clean package shade:shade

Using this jar, I set up a new Lambda function through the UI:

Function Name: S3EventLambda
Runtime: Java 11
Execution Role: Needed by AWS to execute the function and to upload logs. From the AWS docs: an execution role grants the function permission to upload logs. Lambda assumes the execution role when you invoke your function, and uses it to create credentials for the AWS SDK and to read data from event sources. [Link]
Memory: 256 MB
Timeout: 30 seconds

With this, I uploaded the jar from the UI and my function was ready to go.

To test the Lambda, I used the command line:

aws lambda invoke --function-name S3EventLambda --invocation-type Event --payload file://EventSample.txt --region us-east-1 --profile burner response.txt --cli-binary-format raw-in-base64-out

The payload file is an S3 event as shown here.
The execution resulted in a file being created in the destination S3 bucket.

The CloudWatch logs for the execution appeared under the "/aws/lambda/S3EventLambda" log group. Every time the Lambda is updated, its logs are posted under a new log stream.

START RequestId: c2af3e35-f4b1-4ea2-b327-114ced54e163 Version: $LATEST
In Handler: Executing S3EventLambda, $LATEST, 256, 28560
{
    "Records": [
        {
            "awsRegion": "us-west-2",
            "eventName": "ObjectCreated:Put",
            "eventSource": "aws:s3",
            "eventTime": "1970-01-01T00:00:00.000Z",
            "eventVersion": "2.0",
            "requestParameters": {
                "sourceIPAddress": "127.0.0.1"
            },
            "responseElements": {
                "x-amz-id-2": "FMyUVURIY8/IgAtTv8xRjskZQpcIZ9KG4V5Wp6S7S/JRWeUWerMUE5JgHvANOjpD",
                "x-amz-request-id": "C3D13FE58DE4C810"
            },
            "s3": {
                "configurationId": "testConfigRule",
                "bucket": {
                    "name": "service-destination",
                    "ownerIdentity": {
                        "principalId": "A3NL1KOZZKExample"
                    },
                    "arn": "arn:aws:s3:::service-destination"
                },
                "object": {
                    "key": "World.txt",
                    "size": 1024,
                    "eTag": "d41d8cd98f00b204e9800998ecf8427e",
                    "versionId": "096fKKXTRTtl3on89fVO.nfljtsv6qko",
                    "sequencer": "",
                    "urlDecodedKey": "World.txt"
                },
                "s3SchemaVersion": "1.0"
            },
            "userIdentity": {
                "principalId": "AIDAJDPLRKLG7UEXAMPLE"
            },
            "glacierEventData": null
        }
    ]
}
Writing to: mylambda-destination/World.txt
Apr 10, 2020 3:56:33 AM com.amazonaws.util.Base64 warn
WARNING: JAXB is unavailable. Will fallback to SDK implementation which may be less performant
Successfully resized service-destination/World.txt and uploaded to mylambda-destination/World.txt
Completed Handler: Executing S3EventLambda, $LATEST
END RequestId: c2af3e35-f4b1-4ea2-b327-114ced54e163
REPORT RequestId: c2af3e35-f4b1-4ea2-b327-114ced54e163 Duration: 25579.62 ms Billed Duration: 25600 ms Memory Size: 256 MB Max Memory Used: 155 MB Init Duration: 239.09 ms

Along with the logs emitted by the Lambda function code, there are 3 log lines that AWS Lambda adds - beginning with 'START', 'END' and 'REPORT'.
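
The REPORT line is the one worth reading closely, since it carries the billed duration and memory figures. Here is a small sketch that pulls those numbers out of the line with a regular expression and turns them into GB-seconds (billed seconds multiplied by configured memory in GB). The class name is hypothetical and the pattern is based only on the REPORT line shown above, so treat it as an assumption about the format rather than a specification.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReportLineSketch {

    // Matches the fields of a REPORT line as it appears in the log output above.
    private static final Pattern REPORT = Pattern.compile(
            "Duration: ([\\d.]+) ms Billed Duration: (\\d+) ms "
            + "Memory Size: (\\d+) MB Max Memory Used: (\\d+) MB");

    public final double durationMs;
    public final long billedMs;
    public final int memorySizeMb;
    public final int maxMemoryUsedMb;

    public ReportLineSketch(String line) {
        Matcher m = REPORT.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a REPORT line: " + line);
        }
        durationMs = Double.parseDouble(m.group(1));
        billedMs = Long.parseLong(m.group(2));
        memorySizeMb = Integer.parseInt(m.group(3));
        maxMemoryUsedMb = Integer.parseInt(m.group(4));
    }

    // Billed compute in GB-seconds: billed seconds times configured memory in GB.
    public double billedGbSeconds() {
        return (billedMs / 1000.0) * (memorySizeMb / 1024.0);
    }

    public static void main(String[] args) {
        ReportLineSketch report = new ReportLineSketch(
                "REPORT RequestId: c2af3e35-f4b1-4ea2-b327-114ced54e163 "
                + "Duration: 25579.62 ms Billed Duration: 25600 ms "
                + "Memory Size: 256 MB Max Memory Used: 155 MB Init Duration: 239.09 ms");
        System.out.println("Billed ms: " + report.billedMs);
        System.out.println("Memory headroom MB: "
                + (report.memorySizeMb - report.maxMemoryUsedMb));
        System.out.println("Billed GB-seconds: " + report.billedGbSeconds());
    }
}
```

For the run above this works out to 25.6 seconds at 0.25 GB, i.e. 6.4 GB-seconds, with about 100 MB of the configured memory left unused.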

Friday, 10 April 2020
