Tuesday, June 12, 2018

Using an HTTP Traffic Stream and Kinesis Analytics to Identify Abnormalities

This blog post comes from a project I have been working on recently. To protect sensitive data, some content has been changed; however, the case is a real one from our production system.

Background and motivation

The internet is a dangerous place. In our company's case, some malicious users call one of our APIs with scripts for their own benefit; the company loses money on this, and the system wastes resources serving these requests.
Usually, the solution is based on analyzing the ELB/ALB access logs. However, an ELB/ALB access log entry has the following format:
timestamp elb client:port backend:port request_processing_time backend_processing_time response_processing_time elb_status_code backend_status_code received_bytes sent_bytes "request" "user_agent" ssl_cipher ssl_protocol

As the format shows, the information about the client in each request is very general. Based on what the access log provides, people usually end up blocking IP addresses or CIDR ranges. Blocking IPs is problematic, since an attacker can change IPs easily and a single IP may be shared by many legitimate users.
Besides, the latency is unpredictable: the ELB/ALB first pushes the log to S3, and then a Lambda function retrieves the S3 file. Since the Lambda function is only triggered after the file is created, this is not real time.

Switch to an information-rich data source

As the previous post mentioned, I have been working on the API Gateway service for the whole company these days. One goal of this project is to stream all HTTP traffic to one place so that other engineers can build their services on top of this stream.
For each HTTP request, we generate an event and send it to Kinesis. 
Each event captures one HTTP request and the corresponding response as a JSON object.
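The exact production schema is confidential, but a sketch with illustrative field names might look like this:

```json
{
  "timestamp": "2018-06-12T10:15:32.123Z",
  "clientIp": "203.0.113.10",
  "method": "POST",
  "host": "login.company.com",
  "path": "/login",
  "headers": {
    "authorization": "Bearer <token>",
    "user-agent": "curl/7.58.0"
  },
  "statusCode": 200,
  "responseTimeMillis": 42
}
```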
As the JSON object shows, the information is much richer than the ELB/ALB access log: it contains everything about the HTTP request, and we can add more fields to the object, such as response time.

Analyzing the data using Kinesis Analytics

After streaming all the data we need to one place, we can use Kinesis Analytics to generate the output we want.

Let's assume some attackers try to brute-force the API http://login.company.com/login, and that the authorization header is unique for each user. On this API we put a constraint of 5 requests per minute per user, so we can easily express this as a simple SQL query in a Kinesis Analytics application:
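The production query isn't reproduced here; a minimal sketch of what it might look like, assuming the in-application input stream is named SOURCE_SQL_STREAM_001 and carries "path" and "authorization" columns, counts requests per authorization header over a one-minute sliding window:

```sql
CREATE OR REPLACE STREAM "BRUTE_FORCING_DETECTION_STREAM" (
    "authorization" VARCHAR(512),
    "request_count" INTEGER
);

CREATE OR REPLACE PUMP "DETECTION_PUMP" AS
INSERT INTO "BRUTE_FORCING_DETECTION_STREAM"
SELECT STREAM "authorization", "request_count"
FROM (
    SELECT STREAM "authorization",
           -- count of requests from the same authorization header
           -- within the last minute (sliding window)
           COUNT(*) OVER W1 AS "request_count"
    FROM "SOURCE_SQL_STREAM_001"
    WHERE "path" = '/login'
    WINDOW W1 AS (
        PARTITION BY "authorization"
        RANGE INTERVAL '1' MINUTE PRECEDING
    )
)
WHERE "request_count" > 5;
```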

BRUTE_FORCING_DETECTION_STREAM is the output stream, and we can connect it to a Lambda function that looks like this:
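The real function calls our internal API Gateway admin API, which I can't reproduce here; a minimal sketch, assuming Kinesis Analytics delivers output records base64-encoded under a "records" key and expects each record to be acknowledged:

```python
import base64
import json


def block_authorization(authorization):
    # Placeholder: in production this would call the API Gateway
    # service's admin API to add the authorization header to the
    # block list.
    print(f"blocking {authorization}")


def handler(event, context):
    # Each record's "data" field is a base64-encoded JSON payload
    # emitted by the Kinesis Analytics output stream.
    results = []
    for record in event["records"]:
        payload = json.loads(base64.b64decode(record["data"]))
        block_authorization(payload["authorization"])
        # Acknowledge the record so Kinesis Analytics does not retry it.
        results.append({"recordId": record["recordId"], "result": "Ok"})
    return {"records": results}
```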

This Lambda function is executed about once per second; whenever there is an event in the BRUTE_FORCING_DETECTION_STREAM, the Lambda receives the authorization header and calls the API Gateway service to block it. The data flow is: API Gateway → Kinesis Stream → Kinesis Analytics → BRUTE_FORCING_DETECTION_STREAM → Lambda → API Gateway block list.

The latency

The key to identifying this kind of malicious user is latency. If it takes 10 minutes to block a user, he may have already finished his job. For testing, I wrote a simple program that accesses this API at a fixed interval.
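The original test program isn't shown; a sketch of what it might look like, with the URL and interval as assumptions and the request function injectable so the loop can be exercised without a live endpoint:

```python
import time
import urllib.error
import urllib.request

LOGIN_URL = "http://login.company.com/login"  # the API under test


def call_api(url, authorization):
    """Make one request; treat HTTP error statuses as normal responses."""
    req = urllib.request.Request(url, headers={"Authorization": authorization})
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as e:
        return e.code, e.read().decode()


def hammer(url, authorization, interval_seconds=10, send=None):
    """Call the API at a fixed interval until a request is rejected.

    `send` can be injected for testing; it defaults to the real HTTP call.
    Returns (elapsed_seconds, successful_requests, last_response_body).
    """
    send = send or (lambda: call_api(url, authorization))
    start, count = time.time(), 0
    while True:
        status, body = send()
        if status != 200:
            # The gateway rejected us: report how far we got.
            return int(time.time() - start), count, body
        count += 1
        time.sleep(interval_seconds)
```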

After a while, the program stopped, and this was the output:
Wrong:{"errorCode":"ZUUL-EDGE-0001","operationMessage":"Your request has been throttled.","message":"Your request has been throttled.","fields":null}, execute 11 requests.
Executing 112 second, 11 requests

So, this "attacker" succeeded 11 times over roughly 2 minutes before being blocked. Our API Gateway service pulls configuration every minute, so it actually took Kinesis Analytics only about 1 minute to spot the "attacker", which is exactly the length of our sliding window.

The reusable stream data

As stated before, we don't want to solve problems case by case; we want to deliver something general enough to be reused across the system. The HTTP traffic stream is an example: brute-force detection is just one use case, and given this data and the other AWS services, the potential is huge. For example, there is an article showing how to build real-time hotspot detection for a taxi-hailing company using the Kinesis Analytics machine learning functions.


In this post, we showed that by combining some in-house components with AWS services, we can build a real-time system for monitoring our HTTP traffic across multiple use cases.

