Case Study on Amazon SQS

Naveen Pareek
8 min read · Oct 2, 2021

As we all know, Amazon Web Services provides numerous inexpensive cloud computing services that make it easier for us to handle the humongous amounts of data in today’s technology-driven world.

As of 2021, AWS comprises over 200 products and services, including computing, storage, networking, databases, analytics, application services, deployment, management, machine learning, mobile, developer tools, and tools for the Internet of Things. However, back in November 2004, a couple of years after the AWS platform first launched, the very first service released for public usage was SQS (Simple Queue Service), which makes SQS the oldest AWS service still available.

Now, let's discuss Amazon SQS in brief.

Amazon Simple Queue Service (Amazon SQS) is a distributed message queuing service introduced by Amazon.com in late 2004. It supports programmatic sending of messages via web service applications as a way to communicate over the Internet. SQS is intended to provide a highly scalable hosted message queue that resolves issues arising from the common producer-consumer problem or connectivity between producer and consumer.

It offers a secure, durable, and available hosted queue that lets you integrate and decouple distributed software systems and components. Amazon SQS can be described as commoditization of the messaging service. Well-known examples of messaging service technologies include IBM WebSphere MQ and Microsoft Message Queuing. Unlike these technologies, users do not need to maintain their own server. Amazon does it for them and sells the SQS service at a per-use rate.

Asynchronous workflows have always been the primary use case for SQS. Using queues ensures one component can keep running smoothly without losing data when another component is unavailable or slow.

Amazon SQS offers common constructs such as dead-letter queues and cost allocation tags. It provides a generic web services API that you can access using any programming language that the AWS SDK supports.
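To make the dead-letter-queue idea concrete, here is a minimal sketch using boto3, the current Python SDK for AWS (the article itself doesn’t show code); the queue names and the maxReceiveCount of 5 are hypothetical choices:

```python
import json
import boto3

sqs = boto3.client("sqs")

# Create the dead-letter queue and look up its ARN.
dlq_url = sqs.create_queue(QueueName="orders-dlq")["QueueUrl"]
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq_url, AttributeNames=["QueueArn"]
)["Attributes"]["QueueArn"]

# Create the main queue with a redrive policy: after 5 failed receives,
# a message is moved to the dead-letter queue for inspection.
sqs.create_queue(
    QueueName="orders",
    Attributes={
        "RedrivePolicy": json.dumps(
            {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": "5"}
        )
    },
)
```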

Amazon SQS Architecture

There are three main parts in a distributed messaging system:

◼ The components of the distributed system
◼ A queue (distributed across Amazon SQS servers)
◼ The messages in the queue

Components of a distributed system usually refer to producer and consumer. Producers are components that send messages to the queue and consumers are components that consume messages from the queue. There can be any number of producers and consumers. Some examples of those components are microservices, apps, APIs, etc.

The queue itself is distributed across several Amazon SQS servers, which is how SQS delivers on its “at least once” delivery guarantee.
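As a rough illustration of these three parts, here is a minimal producer/consumer sketch in Python with boto3 (the queue name “work-items” and the message body are hypothetical):

```python
import boto3

sqs = boto3.client("sqs")
queue_url = sqs.create_queue(QueueName="work-items")["QueueUrl"]

# Producer: put a message onto the queue.
sqs.send_message(QueueUrl=queue_url, MessageBody="process item 42")

# Consumer: long-poll for up to 10 messages, waiting up to 20 seconds.
resp = sqs.receive_message(
    QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=20
)
for msg in resp.get("Messages", []):
    print("got:", msg["Body"])
    # Delete only after successful processing; otherwise the message
    # becomes visible again once its visibility timeout expires.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```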

How Does It Work?

Amazon SQS works like any other messaging queue, with a few enhancements. A producer creates a message and puts it into a queue. There can be multiple producers adding multiple messages to the queue at the same time; you don’t have to worry about traffic or peaks, because SQS handles that for you.

In order to understand how SQS actually works, let us analyze a scenario where you want to convert a video file into an .mp3 file. You find an amazing mp3-converter website and upload your video to it. The following steps happen behind the scenes (a code sketch of this flow follows the list):

  • The website stores the video in Amazon S3 and triggers a Lambda function.
  • The Lambda function extracts data about the video and adds it to the queue with a SendMessage request. The data can be anything about the video in S3, such as its duration or background noise. If a FIFO queue is used, each message also carries a deduplication ID.
  • SQS acts as a taskmaster. It holds the details of all the tasks to be performed and waits for an EC2 instance (the consumer in this example) to pull a task by sending a ReceiveMessage request.
  • Once it picks up a task, the EC2 instance reads it, creates the .mp3 file, and stores the result back in the same or a different S3 bucket.
  • After completing the job, the EC2 instance goes back to SQS with another ReceiveMessage request to pick up the next available task from the queue.
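The article doesn’t include the converter’s code, so the following is only a hedged sketch of the two sides of this flow in Python with boto3; the queue URL and the convert_to_mp3 helper are hypothetical stand-ins:

```python
import json
import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/video-jobs"  # hypothetical

def lambda_handler(event, context):
    """Producer side: the Lambda fired when a video lands in S3."""
    record = event["Records"][0]
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({
            "bucket": record["s3"]["bucket"]["name"],
            "key": record["s3"]["object"]["key"],
        }),
    )

def convert_to_mp3(bucket, key):
    """Placeholder for the real conversion and upload back to S3."""
    print(f"converting s3://{bucket}/{key} to .mp3")

def worker_loop():
    """Consumer side: the EC2 instance that keeps pulling tasks."""
    while True:
        resp = sqs.receive_message(
            QueUrl=QUEUE_URL, MaxNumberOfMessages=1, WaitTimeSeconds=20
        ) if False else sqs.receive_message(
            QueueUrl=QUEUE_URL, MaxNumberOfMessages=1, WaitTimeSeconds=20
        )
        for msg in resp.get("Messages", []):
            job = json.loads(msg["Body"])
            convert_to_mp3(job["bucket"], job["key"])
            sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```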

You can also pair SQS with an Auto Scaling group that monitors the number of messages in the queue and automatically provisions more instances (or removes them) as the backlog grows or shrinks.
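The scaling signal here is the queue’s ApproximateNumberOfMessages attribute, which also surfaces in CloudWatch (as ApproximateNumberOfMessagesVisible) for Auto Scaling alarms. A hedged sketch of reading it directly (the queue URL is hypothetical):

```python
import boto3

sqs = boto3.client("sqs")
queue_url = "https://sqs.us-east-1.amazonaws.com/123456789012/video-jobs"  # hypothetical

attrs = sqs.get_queue_attributes(
    QueueUrl=queue_url,
    AttributeNames=["ApproximateNumberOfMessages"],
)
backlog = int(attrs["Attributes"]["ApproximateNumberOfMessages"])
print(f"messages waiting: {backlog}")
# In practice, an Auto Scaling policy alarms on this CloudWatch metric
# and adds or removes EC2 worker instances as the backlog changes.
```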

Types of Amazon SQS Queues

SQS offers two types of message queues:
Standard queues offer maximum throughput, best-effort ordering, and at-least-once delivery.

FIFO queues are designed to guarantee that messages are processed exactly once, in the exact order that they are sent.
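A brief sketch of the FIFO behaviour with boto3; the “.fifo” suffix is required by SQS, while the queue and group names are hypothetical:

```python
import boto3

sqs = boto3.client("sqs")

# FIFO queue names must end in ".fifo".
fifo_url = sqs.create_queue(
    QueueName="orders.fifo",
    Attributes={"FifoQueue": "true"},
)["QueueUrl"]

# Messages with the same MessageGroupId are delivered in order; the
# deduplication ID lets SQS drop retries of the same message within
# the 5-minute deduplication window.
sqs.send_message(
    QueueUrl=fifo_url,
    MessageBody="charge card for order 42",
    MessageGroupId="order-42",
    MessageDeduplicationId="order-42-charge",
)
```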

Benefits of Amazon SQS

◼Durability — For the safety of your messages, Amazon SQS stores them on multiple servers. Standard queues support at-least-once message delivery, and FIFO queues support exactly-once message processing.

◼Availability — Amazon SQS uses redundant infrastructure to provide highly concurrent access to messages and high availability for producing and consuming messages.

◼Scalability — Amazon SQS can process each buffered request independently, scaling transparently to handle any load increases or spikes without any provisioning instructions.

◼Reliability — Amazon SQS locks your messages during processing so that multiple producers can send and multiple consumers can receive messages at the same time.

◼Customization — Your queues don’t have to be exactly alike — for example, you can set a default delay on a queue. You can store the contents of messages larger than 256 KB using Amazon Simple Storage Service (Amazon S3) or Amazon DynamoDB, with Amazon SQS holding a pointer to the Amazon S3 object, or you can split a large message into smaller messages.
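AWS provides an Extended Client Library for Java that automates this large-message pattern; as a hand-rolled, hedged sketch of the same pointer idea in Python (bucket name and queue URL are hypothetical):

```python
import json
import uuid
import boto3

s3 = boto3.client("s3")
sqs = boto3.client("sqs")

BUCKET = "large-message-payloads"  # hypothetical bucket
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/big-jobs"  # hypothetical

def send_large_message(payload: bytes) -> None:
    """Store a payload that exceeds 256 KB in S3 and enqueue only a pointer to it."""
    key = f"payloads/{uuid.uuid4()}"
    s3.put_object(Bucket=BUCKET, Key=key, Body=payload)
    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=json.dumps({"s3_bucket": BUCKET, "s3_key": key}),
    )

# The consumer reads the small pointer message, then fetches the
# real payload with s3.get_object(Bucket=..., Key=...).
```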

Now, it's time for the main task: a case study of an organization named Oyster that uses Amazon SQS behind the scenes.

About Oyster

New York-based Oyster.com shares unvarnished reviews of hotels in nearly 200 destinations worldwide. The company’s own investigators visit each location to assess cleanliness, amenities, service and overall quality. What sets Oyster apart from similar sites is its extensive collection of photographs. Oyster takes hundreds of photos for each property, and every review includes dozens of untouched images (submitted by guests as well as investigators) that allow potential visitors to compare a hotel’s marketing material with reality.

The Challenge

Since its 2009 launch, Oyster has published more than one million high-quality digital images. When this massive volume of images became too cumbersome to handle in-house, the company decided to offload the content to a central repository on Amazon Simple Storage Service (Amazon S3). “We migrated to Amazon S3 in 2010,” says Eytan Seidman, Co-Founder and Vice President of Product. “We chose moving to the cloud and Amazon S3 because storing images in our data center would have been too costly. Amazon S3 was a more economical solution.”

Oyster reprocesses its entire collection of photographic images a few times each year to update the copyright year and, if necessary, to change the watermarks. Using their previous solution, reprocessing the entire collection of photographs required about 800 hours to complete. In addition, Oyster often recreated existing images in new formats and sizes for mobile and tablet devices. Resizing existing images and adding new ones was slowing down the rate at which the company was able to process the collection. “Our processes were slowing down,” says Seidman. “When the iPad with Retina display came out, for example, it took us more than a week to create new sizes specifically for that resolution.” Oyster considered purchasing additional hardware but found the cost of new hardware and routine maintenance was too high, especially when the machines would sit idle most of the time.

Moreover, there were numerous software bugs in the multiprocessing solution that the company used, but since the solution didn’t scale, Oyster didn’t bother to fix them.

Why Amazon Web Services?

“We were already using Amazon S3 to store the images, so using Amazon Elastic Compute Cloud (Amazon EC2) to process the images was a natural choice,” Seidman says. Chris McBride, a software engineer at Oyster, adds, “We wanted a cloud environment that could be ramped up for the large processing jobs and downsized for the smaller daily jobs.”

While the company is still running one local server, the bulk of the processing work now takes place on the AWS Cloud. Oyster is using a customized Amazon Linux AMI within Amazon EC2. Within this new environment, the company connects to Amazon S3 and Amazon Simple Queue Service (Amazon SQS) using boto, a Python interface to AWS. The images themselves are processed with the ImageMagick software available in the AMI package.

Oyster uses Amazon EC2 instances and Amazon SQS in an integrated workflow to generate the sizes they need for each photo. The team processes a few thousand photos each night, using Amazon EC2 Spot Instances. When Oyster processes the entire collection, it can use up to 100 Amazon EC2 instances. The team uses Amazon SQS to communicate the photos that need to be processed and the status of the jobs.
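Oyster’s code isn’t public, so the following is only a hedged sketch of the worker pattern the case study describes, written with boto3 (rather than the original boto) and shelling out to ImageMagick’s convert command; the queue URL, bucket layout, and output sizes are hypothetical:

```python
import json
import subprocess
import boto3

sqs = boto3.client("sqs")
s3 = boto3.client("s3")

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/photo-jobs"  # hypothetical
SIZES = ["1280x", "640x", "320x"]  # hypothetical output widths

while True:
    resp = sqs.receive_message(QueueUrl=QUEUE_URL, MaxNumberOfMessages=1, WaitTimeSeconds=20)
    for msg in resp.get("Messages", []):
        job = json.loads(msg["Body"])
        s3.download_file(job["bucket"], job["key"], "original.jpg")
        for size in SIZES:
            out = f"resized-{size}.jpg"
            # ImageMagick resize, as mentioned in the case study.
            subprocess.run(["convert", "original.jpg", "-resize", size, out], check=True)
            s3.upload_file(out, job["bucket"], f"{job['key']}-{size}.jpg")
        # Report completion by removing the job from the queue.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=msg["ReceiptHandle"])
```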

The Benefits

Oyster’s old system needed approximately 400 hours to process one million photos. By using AWS, the company can process the same number of photos in about 20 hours — a 95 percent improvement.

Oyster has also been able to reduce in-house hardware expenses by repurposing two of its old servers, which were sitting idle more than 80 percent of the time. By the company’s own estimate, it saved roughly $10,000 in capital expenditures by moving to AWS and reduced its operating expenses by an additional $10,000. AWS lets the team move faster without worrying about machine expenditures or maintenance, freeing them to focus on other things.

Thank You!

Keep Learning & Sharing…

If this article was useful to you, don’t forget to press the clap 👏 icon, and follow me on Medium for more such articles.

Leave a comment if you have any doubts, or connect with me on LinkedIn.
