Building a Proper REST API with authentication


The first thing to be clear about: your method of authentication means nothing if you are not using HTTPS. Without HTTPS, your users' credentials can be snooped in transit.

What is REST?

TL;DR Cheat sheet

Representational State Transfer

Resource based

We are talking about things (nouns) instead of actions (verbs)

A resource is identified by its URI

The representation is not the resource, it is just a representation.


How resources get manipulated.

Part of the resource state.



Resource: person

Service: contact_info GET

Representation: name, address, phonenumber (JSON/XML)
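As a rough illustration of the resource/representation split, the sketch below keeps one person resource in memory and renders it as JSON or XML. The `Person` class, its field names, the sample values and the `/persons/1` URI are all assumptions made for the example, not part of any real API.

```python
import json
from dataclasses import dataclass, asdict
from xml.etree import ElementTree as ET


@dataclass
class Person:
    """The resource itself: a thing (noun), identified by a URI such as /persons/1."""
    name: str
    address: str
    phone_number: str


def to_json(person: Person) -> str:
    """One representation of the resource."""
    return json.dumps(asdict(person), sort_keys=True)


def to_xml(person: Person) -> str:
    """Another representation of the same resource."""
    root = ET.Element("person")
    for field, value in asdict(person).items():
        ET.SubElement(root, field).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data
person = Person("Jane Doe", "1 Main Road", "555-0100")
print(to_json(person))
print(to_xml(person))
```

Both outputs describe the same person; neither of them is the person resource itself.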

6 Constraints

Violating any of these, except the optional Code on Demand, means your API is not strictly RESTful

Uniform Interface

A consistent interface between client and server


URIs: resource names

HTTP Response: status, body


Stateless

The server contains no client state; each request has enough context for the server to process it in isolation

If there is state, it is kept on the client side
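A minimal sketch of what "enough context in each request" can look like for authentication: the client attaches its credentials to every request instead of relying on a server-side session. HTTP Basic auth is used here only because it is short to show; the host `api.example.com`, the username and the password are made up for the example.

```python
import base64
import urllib.request


def build_request(url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request that carries its own credentials (HTTP Basic auth)."""
    if not url.startswith("https://"):
        # Basic auth is only base64-encoded, not encrypted, so refuse plain HTTP.
        raise ValueError("credentials must only be sent over https")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
        method="GET",
    )


# Hypothetical URL and credentials; the request is built but never sent.
req = build_request("https://api.example.com/users/12345", "alice", "s3cret")
print(req.get_method(), req.full_url)
print(req.get_header("Authorization"))
```

Because nothing is remembered between calls, any server behind a load balancer can handle any request.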


Client-Server

Assume a disconnected system

Separation of concerns


Cacheable

Server responses (representations) can be cached


Explicit – server specifies

Negotiated – client and server negotiate

Layered System

Client does not assume direct connection

You don’t know where or how you are getting the data


Code on Demand

Server can temporarily extend a client by transferring logic to it

Client executes logic

An optional constraint

REST API Allows

  • Scalability
  • Simplicity
  • Modifiability
  • Visibility
  • Portability
  • Reliability


HTTP Verbs similarity with CRUD
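The conventional mapping between HTTP verbs and CRUD operations can be sketched as a simple lookup table. This is the commonly used convention rather than something the HTTP specification mandates, and the helper name `crud_for` is made up for the example.

```python
# Conventional mapping of HTTP verbs onto CRUD operations.
VERB_TO_CRUD = {
    "POST": "Create",
    "GET": "Read",
    "PUT": "Update (replace)",
    "PATCH": "Update (partial)",
    "DELETE": "Delete",
}


def crud_for(method: str) -> str:
    """Return the CRUD operation an HTTP method usually corresponds to."""
    return VERB_TO_CRUD[method.upper()]


for verb in VERB_TO_CRUD:
    print(f"{verb:<7} -> {crud_for(verb)}")
```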





Use URL not query string

  • Good: /users/12345
  • Poor: /api?type=user&id=23

Design for your clients not your data

Use Plurals for consistency


  • Recommended: /customers/33245/orders/8769/lineitems/1
  • Not: /customer/33245/order/8769/lineitem/1

Use the correct HTTP Status Code Responses
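As a small sketch of picking the right status code, the lookup below maps a few request outcomes to standard codes using Python's `http.HTTPStatus`. The outcome names on the left are hypothetical labels invented for this example; the codes themselves are standard.

```python
from http import HTTPStatus

# Hypothetical outcome names mapped to the status code a client would expect.
OUTCOME_TO_STATUS = {
    "fetched": HTTPStatus.OK,                  # 200: GET succeeded
    "created": HTTPStatus.CREATED,             # 201: POST created a resource
    "deleted": HTTPStatus.NO_CONTENT,          # 204: DELETE succeeded, no body
    "invalid_input": HTTPStatus.BAD_REQUEST,   # 400: malformed request
    "not_logged_in": HTTPStatus.UNAUTHORIZED,  # 401: missing or invalid credentials
    "not_allowed": HTTPStatus.FORBIDDEN,       # 403: authenticated but not permitted
    "missing": HTTPStatus.NOT_FOUND,           # 404: no such resource
}


def status_line(outcome: str) -> str:
    """Format the numeric code and reason phrase for an outcome."""
    status = OUTCOME_TO_STATUS[outcome]
    return f"{status.value} {status.phrase}"


for outcome in OUTCOME_TO_STATUS:
    print(f"{outcome:<14} {status_line(outcome)}")
```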

Offer JSON and XML

Use hypermedia links

A key concept that is central to the idea of what REST really is.

Hypermedia links (HATEOAS), or Hypermedia as the Engine of Application State, make services more discoverable and self-descriptive

The client needs no prior knowledge about resources etc.

Documentation should not be a requirement to understand the API

So you can browse an API just like browsing the web <- that is restful

The client should not need to know how to interact with the data (hard-coded URLs / resource names); the server should tell it

So you can allow the HTML media type and really browse the API like you do the web

Effort required increases
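To make the hypermedia idea concrete, here is a minimal sketch of a response that carries its own links, so the client follows relation names instead of building URLs itself. The `_links` layout follows one common convention (HAL-style) and is an assumption, as are the function names and the `status` field.

```python
import json


def order_representation(customer_id: int, order_id: int) -> dict:
    """Build an order representation that tells the client where it can go next."""
    base = f"/customers/{customer_id}/orders/{order_id}"
    return {
        "id": order_id,
        "status": "open",  # hypothetical field
        "_links": {
            "self": {"href": base},
            "customer": {"href": f"/customers/{customer_id}"},
            "lineitems": {"href": f"{base}/lineitems"},
        },
    }


def follow(representation: dict, rel: str) -> str:
    """A client looks up a link by its relation name instead of hard-coding the URL."""
    return representation["_links"][rel]["href"]


order = order_representation(33245, 8769)
print(json.dumps(order, indent=2))
print(follow(order, "lineitems"))
```

If the server later moves line items to a different path, a client that only uses `follow` keeps working.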

Likely Requests and Responses

A list of HTTP methods and responses

Naming Resources

Use nouns

They should be predictable

Choose for clients not your data

Plurals: it is debated, but rather always use them


Idempotence

Basically, an action can be applied to an object multiple times, but applying it more than once will not change the state or the result of its application.

E.g. getting a cow pregnant

A GET never changes data, so it is idempotent (and a safe method)

PUT is idempotent: updating an object with the same data again leaves it in the same state and returns the same result

DELETE is idempotent: the first call removes the resource and later calls return a 404, but the state on the server stays the same. NB: it is often better to mark for deletion instead of actually deleting

POST is NOT idempotent, as every new POST produces a new, different result
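The difference can be sketched with a tiny in-memory store. The `Store` class is a hypothetical stand-in for a server-side collection, and the 204/404 return values of `delete` are just one reasonable choice of status codes.

```python
class Store:
    """Tiny in-memory stand-in for a server-side collection (hypothetical)."""

    def __init__(self):
        self.items = {}

    def post(self, data):
        """Create a new item on every call: not idempotent."""
        new_id = max(self.items, default=0) + 1
        self.items[new_id] = dict(data)
        return new_id

    def put(self, item_id, data):
        """Replace the item at a known id: idempotent."""
        self.items[item_id] = dict(data)

    def delete(self, item_id):
        """Remove the item; 204 the first time, 404 afterwards."""
        return 204 if self.items.pop(item_id, None) is not None else 404


store = Store()

# PUT twice with the same data: the state after the second call is unchanged.
store.put(1, {"name": "Daisy"})
store.put(1, {"name": "Daisy"})
print("after two PUTs:", len(store.items), "item(s)")

# POST twice with the same data: two different resources are created.
first = store.post({"name": "Daisy"})
second = store.post({"name": "Daisy"})
print("after two POSTs:", len(store.items), "item(s), ids", first, second)

# DELETE twice: the response differs, but the server state is the same.
print("DELETE responses:", store.delete(1), store.delete(1))
```

Note that idempotence is about the state on the server, not about the response: the second DELETE answers differently but changes nothing.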




What is REST tutorial

Oauth2 vs Json Web Tokens

CPU usage high…is there a bot flooding you with requests?

CPU Usage High Is there a Bot at work?

Alright, something is going on: we are getting New Relic alerts.


As you can see, something strange started on the 5th of October. Check under server -> Apps in New Relic to see which application is causing the high CPU usage. In this case it was Apache at 80.4% CPU usage.

The first port of call would be /var/log/apache2/access.log; if you tail -f the log you will see the frequency of requests. An example is shown below.

- - [14/Oct/2016:12:10:45 +0000] "GET /products/new-products-category.html?color=white,ivory,brown,beige,natural&dir=desc&order=position HTTP/1.1" 403 587 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +"
- - [14/Oct/2016:12:10:47 +0000] "GET /products/grass-products-category.html?color=pink,blue,natural&dir=desc&order=position&p=2 HTTP/1.1" 403 585 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +"
- - [14/Oct/2016:12:10:46 +0000] "POST /index.php/autoassign/adminhtml_api/index/key/1be7da78e61ac5c2f9f7ed4e16084a22/?isAjax=true HTTP/1.1" 200 744 "" "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36"
- - [14/Oct/2016:12:10:48 +0000] "GET /product-decor/mosaic-listellos-category.html?color=gold,red,black HTTP/1.1" 403 595 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +"
- - [14/Oct/2016:12:10:50 +0000] "GET /products/stonewall-dabbing-category.html?color=black,blue,light-brown&dir=desc&order=position HTTP/1.1" 403 592 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +"
- - [14/Oct/2016:12:10:51 +0000] "GET /products/new-products-category.html?color=natural,grey,beige,white,brown&dir=desc&order=name HTTP/1.1" 403 587 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +"
- - [14/Oct/2016:12:10:53 +0000] "GET /product-decor-category.html?color=gold,white,ivory,pink,grey HTTP/1.1" 403 578 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +"
- - [14/Oct/2016:12:10:54 +0000] "GET /products/new-products-category.html?color=grey,ivory,natural,brown,beige&dir=desc&order=name HTTP/1.1" 403 587 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +"
- - [14/Oct/2016:12:10:56 +0000] "GET /products/stonewall-dabbing-category.html?color=white,bronze,terracotta&dir=desc&order=position HTTP/1.1" 403 592 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +"
- - [14/Oct/2016:12:10:57 +0000] "GET /product-decor-category.html?color=gold,red,white,black,pink,grey&dir=asc&order=name HTTP/1.1" 403 578 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +"

What else can we do to check bots

You can also use:

To check the number of requests in that many seconds.

tail -n 500 /var/log/apache2/access.log | cut -d' ' -f1 | sort | uniq -c | sort -gr
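For comparison, the same per-client count as the pipeline above can be sketched in Python, which is handy if you want to extend it later (for example to filter by user agent). The function name `top_clients` and the sample log lines are made up; the addresses come from the ranges reserved for documentation.

```python
from collections import Counter, deque


def top_clients(log_lines, last_n=500):
    """Count requests per client over the last `last_n` lines, busiest first."""
    recent = deque(log_lines, maxlen=last_n)  # like `tail -n 500`
    # First space-separated field is the client address, like `cut -d' ' -f1`.
    ips = (line.split(" ", 1)[0] for line in recent if line.strip())
    return Counter(ips).most_common()  # like `sort | uniq -c | sort -gr`


# Hypothetical sample lines. For a real log you could pass an open file instead:
#   with open("/var/log/apache2/access.log") as fh: top_clients(fh)
sample = [
    '203.0.113.7 - - [14/Oct/2016:12:10:45 +0000] "GET /a HTTP/1.1" 403 587',
    '198.51.100.9 - - [14/Oct/2016:12:10:46 +0000] "POST /b HTTP/1.1" 200 744',
    '203.0.113.7 - - [14/Oct/2016:12:10:47 +0000] "GET /c HTTP/1.1" 403 585',
]
for ip, count in top_clients(sample):
    print(count, ip)
```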

First Steps

The first thing to do is install fail2ban on the server and configure it for apache.

We have found the Issue

There is an IP:

It is making lots of requests to pages that require a lot of processing. What is more, it claims to be Googlebot/2.1; +… Yeah right.

Solving it

First Port of Call

Block the IP with .htaccess:

Order Deny,Allow
Deny from

Next Steps: the automatic solution

For the complete solution we need to block IPs that make more than 300 requests in 300 seconds. Note that you should change these thresholds based on your own criteria.

Add this to jail.local:

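# The jail needs a section header; the name is assumed here to match the filter
[http-get-dos]
enabled = true
port = http,https
filter = http-get-dos
logpath = /var/log/apache2/access.log
maxretry = 300
findtime = 300
# ban for 10 minutes (600 seconds)
bantime = 600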

This will check your apache access log and apply the http-get-dos filter to it.
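To get a feel for how maxretry and findtime interact, here is a simplified model of the counting in Python. It is only a sketch of the idea (count matches per IP inside a sliding time window), not fail2ban's actual implementation; the function name and the sample traffic are invented for the example.

```python
from collections import defaultdict, deque


def find_offenders(events, maxretry=300, findtime=300):
    """Return the IPs that reach `maxretry` matches within `findtime` seconds.

    `events` is an iterable of (unix_timestamp, ip) tuples in chronological order.
    """
    windows = defaultdict(deque)
    offenders = set()
    for timestamp, ip in events:
        window = windows[ip]
        window.append(timestamp)
        # Drop matches that have fallen out of the findtime window.
        while window and timestamp - window[0] > findtime:
            window.popleft()
        if len(window) >= maxretry:
            offenders.add(ip)
    return offenders


# Hypothetical traffic: one client sends 2 requests per second,
# another sends 1 request every 10 seconds.
fast = [(i * 0.5, "203.0.113.7") for i in range(300)]
slow = [(i * 10, "198.51.100.9") for i in range(300)]
print(find_offenders(sorted(fast + slow)))
```

Both clients send 300 requests in total, but only the fast one does so inside a single 300 second window, which is why only it would be banned.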

In the filter.d directory, create the filter file with vim http-get-dos.conf, then add the following to it:

# Fail2Ban configuration file
# Author:

[Definition]

# Option: failregex
# Note: This regex will match any GET or POST entry in your logs, so basically all valid and not valid entries are a match.
# You should set up maxretry and findtime carefully in the jail.local file in order to avoid false positives.

#failregex = ^<HOST> -.*GET.*/ip\.cgi
failregex = ^<HOST> -.*"(GET|POST).*

# Option: ignoreregex
# Notes.: regex to ignore. If this regex matches, the line is ignored.
# Values: TEXT
ignoreregex =
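fail2ban filters use a `<HOST>` tag in the failregex to capture the client address. To sanity-check what such a pattern would match, you can approximate it in Python as below. The `\S+` stand-in for `<HOST>` is a simplification of what fail2ban substitutes, and the log lines are made-up samples using documentation addresses.

```python
import re

# Simplified stand-in for fail2ban's <HOST> tag (fail2ban's own pattern is stricter).
HOST = r"(?P<host>\S+)"
FAILREGEX = re.compile(r'^<HOST> -.*"(GET|POST).*'.replace("<HOST>", HOST))


def matched_host(line):
    """Return the client address if the line counts as a match, else None."""
    match = FAILREGEX.match(line)
    return match.group("host") if match else None


# Hypothetical log lines
get_line = '203.0.113.7 - - [14/Oct/2016:12:10:45 +0000] "GET /a.html HTTP/1.1" 403 587 "-" "Mozilla/5.0"'
head_line = '198.51.100.9 - - [14/Oct/2016:12:10:46 +0000] "HEAD / HTTP/1.1" 200 0 "-" "curl"'

print(matched_host(get_line))
print(matched_host(head_line))
```

fail2ban also ships a `fail2ban-regex` command for testing a filter file against a real log, which is the better check before enabling the jail.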

Yeah, so this should do the trick. I have found that if you specify an action, it won’t actually block that IP.

I will update with results.

The .htaccess change seems to have done the trick:


The point where CPU usage drops is when the .htaccess was edited.

Turns out it is a REAL Googlebot

To check whether the bot is a real Googlebot, check this link. Strange that it is spamming us silly.
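The check Google documents is a reverse DNS lookup on the IP, a check that the hostname is under googlebot.com or google.com, and then a forward lookup to confirm that the hostname resolves back to the same IP. A minimal sketch in Python, assuming IPv4 and the standard `socket` module; the lookups are injectable so that the example can run without network access, and the stubbed hostnames and addresses are hypothetical.

```python
import socket


def is_real_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve the IP, check the domain, then forward-resolve to confirm."""
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        hostname = reverse_lookup(ip)
        if not hostname.endswith((".googlebot.com", ".google.com")):
            return False
        # The hostname must resolve back to the address we started with.
        return ip in forward_lookup(hostname)
    except OSError:
        # socket.herror and socket.gaierror (failed lookups) are OSError subclasses.
        return False


# Stubbed lookups with hypothetical data, so the example runs offline.
fake_reverse = {
    "192.0.2.10": "crawl-192-0-2-10.googlebot.com",
    "203.0.113.7": "host7.example.net",
}
fake_forward = {"crawl-192-0-2-10.googlebot.com": ["192.0.2.10"]}

for address in ("192.0.2.10", "203.0.113.7"):
    genuine = is_real_googlebot(
        address,
        reverse_lookup=fake_reverse.__getitem__,
        forward_lookup=lambda host: fake_forward.get(host, []),
    )
    print(address, "->", "real Googlebot" if genuine else "not verified")
```

Calling `is_real_googlebot(ip)` without the stub arguments performs real DNS lookups. The user agent string alone proves nothing, since anyone can send it.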