Monday, December 28, 2015

Camel CXF REST API and Active MQ running on JBoss EAP

If you are looking for a full sample application that uses Camel CXF REST APIs and Active MQ, running on JBoss EAP, then look no further. Clone this GitHub project.
This project uses the Camel CXF REST API and Active MQ, and it is built to run as a web archive (WAR) on a JBoss EAP server.
It contains all the configuration required to use the following components:
  • Camel CXF REST API
  • Active MQ
  • Configuration using the Spring XML schema
  • CamelSpringTestSupport unit testing framework

brew update fails on Mac OS X 10.11.x El Capitan - Solution

If you are getting errors like:

error: unable to unlink old '.gitignore' (Permission denied)

error: unable to create file .travis.yml (Permission denied)
...

then do the following.

Use the brew doctor command to get a list of potential issues and solutions:

$ brew doctor

For this particular issue, the solution was to change the ownership of /usr/local:

$ sudo chown -R $(whoami) /usr/local

That is it. Now brew update should work!

Friday, September 11, 2015

Nginx web server with Varnish cache

First, let us look at what Nginx and Varnish are:

Nginx (pronounced "engine x") is a web server with a strong focus on high concurrency, performance and low memory usage. It can also act as a reverse proxy server for the HTTP, HTTPS, SMTP, POP3, and IMAP protocols, as well as a load balancer and an HTTP cache.
Nginx can be deployed to serve dynamic HTTP content on the network using FastCGI, SCGI handlers for scripts, WSGI application servers or the Phusion Passenger module, and it can serve as a software load balancer.
Nginx uses an asynchronous event-driven approach to handling requests, instead of the Apache HTTP Server model that defaults to a threaded or process-oriented approach, where the Event MPM is required for asynchronous processing. Nginx's modular event-driven architecture can provide more predictable performance under high loads.

Varnish Cache is a web application accelerator also known as a caching HTTP reverse proxy. You install it in front of any server that speaks HTTP and configure it to cache the contents. Varnish Cache is really, really fast. It typically speeds up delivery with a factor of 300 - 1000x, depending on your architecture.

Install & configure
Nginx can be installed on a Mac using brew:
$ brew install nginx
Then edit /usr/local/etc/nginx/nginx.conf to configure the server and location settings. This is the simplest possible setup:
server {
        listen       8080;
        server_name  localhost;
        location / {
            root   html;
            index  index.html index.htm;
        }
}
Start up Nginx:
bash-3.2$ cd /usr/local/Cellar/nginx/1.8.0/bin
bash-3.2$ nginx
bash-3.2$ ps -ef | grep nginx

Varnish can also be installed on a Mac using brew:
$ brew install varnish
Configure the default.vcl file for Varnish at /usr/local/Cellar/varnish/4.0.3/etc/varnish.
The minimum setting required is:
vcl 4.0;
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

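Start Varnish with a command like the following (the paths are relative, so run it from /usr/local/Cellar/varnish/4.0.3):
bash-3.2$ sudo sbin/varnishd -a :6081 -T localhost:6082 -f etc/varnish/default.vcl -s malloc,512m

Here:
-a is the address and port where Varnish listens for client traffic
-T is the address of the management (admin) interface
-f is the VCL file to load
-s is the storage backend, here 512 MB of memory allocated with malloc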


Run Varnish with Nginx

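With this setup, a request sent to http://localhost:6081/ hits Varnish first. If the requested resource is cached, Varnish returns it directly without contacting Nginx; otherwise it forwards the request to Nginx listening on port 8080, caches the response, and returns it. In both cases the client normally sees HTTP status 200. A 304 Not Modified is returned only when the client sends a conditional request (If-None-Match or If-Modified-Since) and its own copy is still valid.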
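To tell a cached response from one that went to Nginx, you can inspect the response headers (for example with curl -I). The sketch below assumes Varnish's default behaviour of putting one transaction ID in the X-Varnish header on a miss and two IDs on a hit; the class and method names are made up for illustration, and a custom VCL may change or remove this header.

```java
/** Sketch: classify a Varnish response as a cache hit or miss from its headers. */
class VarnishHit {

    // Assumption: default Varnish behaviour, where X-Varnish carries one
    // transaction ID on a miss and two IDs (this request, then the request
    // that originally stored the object) on a hit.
    static boolean isCacheHit(String xVarnishHeader) {
        if (xVarnishHeader == null) {
            return false; // header missing: the response did not pass through Varnish
        }
        return xVarnishHeader.trim().split("\\s+").length == 2;
    }

    public static void main(String[] args) {
        System.out.println(isCacheHit("32770"));       // header of a first request
        System.out.println(isCacheHit("32773 32771")); // header of a repeated request
    }
}
```

The ID values in main are invented sample values, not output captured from a real server.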
Reference:

  • Nginx - https://en.wikipedia.org/wiki/Nginx
  • Varnish - https://www.varnish-cache.org/about

Tuesday, September 8, 2015

Varnish on Windows (using Cygwin)

This post details how to run Varnish on Windows (using Cygwin).

Install Varnish and Cygwin on Windows

To install Varnish and Cygwin on Windows, refer to the download and installation instructions at:
https://www.varnish-cache.org/trac/wiki/VarnishOnCygwinWindows#InstallFullCygwinenvironmentwithvarnishpackage

Configure default.vcl file

  • Take a copy of the default.vcl file located at {cygwin_install}/etc/varnish
  • Now open default.vcl and configure the default backend
I am using port 3000 because it is the port commonly used by Node.js applications.

backend default {
     .host = "127.0.0.1";
     .port = "3000";
 }


Start up Varnish

In the Cygwin window, type the following command to start Varnish:
$ /usr/sbin/varnishd -d -f ./etc/varnish/default.vcl -a 127.0.0.1:6081

This will start Varnish in debug mode, so that you can monitor it.

Once it has started, list the processes:
$ ps -ef | grep varnish

You should see the varnish process running.

Issues with Varnish on Cygwin and solution:

Panic message: Assert error in sock_test(), /usr/src/varnish-3.0.5-1/src/varnish-3.0.5/bin/varnishd/cache_acceptor.c line 166:

  Condition(l == sizeof tcp_nodelay) not true.

Solution:
You will need to download the Varnish source (varnish-3.0.5-1-src.tar.xz) and remove the assert statement at varnish-3.0.5/bin/varnishd/cache_acceptor.c line 166.

Then recompile and install:
  • cygport varnish-{version}.cygport prep
  • cygport varnish-{version}.cygport compile
  • cygport varnish-{version}.cygport install

More details on compiling are at http://sourceforge.net/projects/cygvarnish/files/cygport-packages/64%20bits%20%28x86_64%29/


Monday, June 29, 2015

Hadoop with Cloudera + Maven project in IntelliJ

How to write Hadoop MR programs using IntelliJ

We will use the Cloudera distribution and use Maven to define all the dependencies.

Though I am using IntelliJ, you could use Eclipse and create a similar project.

1. Create new Project in IntelliJ

You can use a Maven archetype to create this project. Use the following archetype:
archetypeGroupId=org.apache.maven.archetypes
archetypeArtifactId=maven-archetype-quickstart

Use at least Java 7. This will generate a Maven project with a pom.xml.

2. Add Cloudera repo in the pom.xml


<repositories>
  <repository>
    <id>cloudera-releases</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos</url>
    <releases>
      <enabled>true</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>


3. Add Hadoop dependencies for writing a client project


<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-client</artifactId>
  <version>${hadoop.version}</version>
</dependency>


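At the time of writing, the latest Cloudera Hadoop client version is 2.6.0-mr1-cdh5.5.0-SNAPSHOT. Define it as the hadoop.version property in the properties section of pom.xml. Note that Maven resolves a -SNAPSHOT version only from repositories that have snapshots enabled, so either enable snapshots for the Cloudera repository or use a release version.

You might also want to add the Maven Central repo:
http://repo1.maven.org/maven2
4. Now try to build the project.
The project should build and all the dependencies should be downloaded.
Now you are good to go. Write all the MapReduce programs you know :)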

Tuesday, June 9, 2015

HTTP Headers useful for REST APIs

Access-Control-* headers

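Preflight - OPTIONS

Before certain cross-origin requests, the browser first sends a preflight OPTIONS request to ask which origins, methods and headers the server allows. When writing a simple CORS filter, the following fields can be set in the response headers: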

response.setHeader("Access-Control-Allow-Origin", "*");
response.setHeader("Access-Control-Allow-Methods", "POST, GET, OPTIONS, DELETE");
response.setHeader("Access-Control-Max-Age", "3600");
response.setHeader("Access-Control-Allow-Headers", "x-requested-with");

A CORS filter containing the code above will respond to all requests with these Access-Control-* headers.

Access-Control-Allow-Origin - which origins are allowed; * allows all origins.
Access-Control-Allow-Methods - the HTTP methods that are allowed for cross-origin calls.
Access-Control-Max-Age - how long, in seconds, the browser may cache the result of the preflight request.
Access-Control-Allow-Headers - the request headers that are allowed on the actual cross-origin call. Here it tells the browser that x-requested-with may be sent as a request header when a CORS call is made.
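As a rough illustration of the filter logic without the servlet API, the sketch below builds the same four headers as a plain map and shows how a preflight request can be recognised. The class name CorsHeaders and its methods are invented for this example, and the request header map is assumed to use the canonical header spelling as its keys.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: the CORS response headers from the filter above, as a plain map. */
class CorsHeaders {

    /** Returns the Access-Control-* headers the filter adds to every response. */
    static Map<String, String> build() {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Access-Control-Allow-Origin", "*");
        headers.put("Access-Control-Allow-Methods", "POST, GET, OPTIONS, DELETE");
        headers.put("Access-Control-Max-Age", "3600");
        headers.put("Access-Control-Allow-Headers", "x-requested-with");
        return headers;
    }

    /**
     * A preflight is an OPTIONS request that carries Access-Control-Request-Method.
     * Assumption: requestHeaders uses canonical header names as keys.
     */
    static boolean isPreflight(String method, Map<String, String> requestHeaders) {
        return "OPTIONS".equalsIgnoreCase(method)
                && requestHeaders.containsKey("Access-Control-Request-Method");
    }

    public static void main(String[] args) {
        build().forEach((name, value) -> System.out.println(name + ": " + value));
    }
}
```

In a real servlet filter each map entry would become one response.setHeader call, and a preflight request would typically be answered right away instead of being passed down the filter chain.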


HTTP Media type headers

(http://www.newmediacampaigns.com/blog/browser-rest-http-accept-headers)
When a web browser makes a request, it sends information to the server about what it is looking for in headers. One of these headers is the Accept header. The Accept header tells the server what file formats, or more correctly MIME types, the browser is looking for.
Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8

 Quality factors allow the user or user agent to indicate the relative degree of preference for that media-range, using the qvalue scale from 0 to 1 (section 3.9). The default value is q=1.

Ordered by preference value, in descending order:

1: text/html, application/xhtml+xml

0.9: application/xml

0.8: */*

When two entries have the same preference, the more specific media type wins. For example, if both application/xml and */* had a preference of 0.9, application/xml would still come first. Firefox chooses to make it explicit that */* is less preferred by giving it a preference of 0.8. Firefox's Accept header is sensible and well thought out. Opera's is too. Other browsers: not so much.
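The ordering rules above can be sketched in a few lines. This is a simplified parser written for illustration (the class AcceptHeader and its methods are hypothetical): it reads only the q parameter, assumes a well-formed header, and breaks ties by specificity and then by position in the header.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: order the media ranges of an Accept header by preference. */
class AcceptHeader {

    /** A media range such as text/html together with its quality factor. */
    static final class MediaRange {
        final String type;
        final double q;

        MediaRange(String type, double q) {
            this.type = type;
            this.q = q;
        }
    }

    // More specific ranges rank higher: type/subtype, then type/*, then */*.
    static int specificity(String type) {
        if (type.equals("*/*")) return 0;
        if (type.endsWith("/*")) return 1;
        return 2;
    }

    /** Returns the media ranges sorted from most to least preferred. */
    static List<String> byPreference(String header) {
        List<MediaRange> ranges = new ArrayList<>();
        for (String part : header.split(",")) {
            String[] pieces = part.trim().split(";");
            double q = 1.0; // default when no q parameter is given
            for (int i = 1; i < pieces.length; i++) {
                String param = pieces[i].trim();
                if (param.startsWith("q=")) {
                    q = Double.parseDouble(param.substring(2));
                }
            }
            ranges.add(new MediaRange(pieces[0].trim(), q));
        }
        // List.sort is stable, so remaining ties keep their header order.
        ranges.sort((a, b) -> {
            int byQ = Double.compare(b.q, a.q);
            return byQ != 0 ? byQ : Integer.compare(specificity(b.type), specificity(a.type));
        });
        List<String> ordered = new ArrayList<>();
        for (MediaRange range : ranges) {
            ordered.add(range.type);
        }
        return ordered;
    }

    public static void main(String[] args) {
        System.out.println(byPreference(
                "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"));
    }
}
```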

Twitter's REST API doesn't use the Accept header for content-negotiation, they use extensions on the URL '.json' and '.xml'.

==============

When Content-Type is missing or wrong, the REST service returns:

415 Unsupported Media Type
"message":"Content type 'null' not supported"
"message":"Content type 'application/vnd.dmi-v2+xfd' not supported"

When Content-Type is correct but the Accept header is wrong, the service returns:
406 Not Acceptable
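The order of these two checks can be sketched as below. This is a deliberately simplified model, not how any particular framework implements it: the class MediaTypeCheck is hypothetical, media types are compared as exact strings, and the Accept header is assumed to hold a single media type or */*.

```java
import java.util.Set;

/** Sketch: which status a service returns for a given Content-Type and Accept. */
class MediaTypeCheck {

    /**
     * consumes: media types the service can read; produces: media types it can write.
     * Assumption: accept holds at most one media type, without parameters.
     */
    static int status(String contentType, String accept,
                      Set<String> consumes, Set<String> produces) {
        // The request body is checked first: missing or unknown type gives 415.
        if (contentType == null || !consumes.contains(contentType)) {
            return 415; // Unsupported Media Type
        }
        // Only then is the requested response format checked.
        if (accept != null && !accept.equals("*/*") && !produces.contains(accept)) {
            return 406; // Not Acceptable
        }
        return 200;
    }

    public static void main(String[] args) {
        Set<String> json = java.util.Collections.singleton("application/json");
        System.out.println(status(null, "application/json", json, json));
        System.out.println(status("application/json", "text/csv", json, json));
    }
}
```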

HTTP Caching headers

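Caching applies mainly to responses to GET and HEAD requests (OPTIONS is a safe method, but its responses are not cacheable).
HTTP 304 Not Modified is returned when a conditional request matches, which tells the client to reuse its cached copy.

Request header      should match   Response header   E.g. value
If-Modified-Since   =              Last-Modified     HTTP-date (Sat, 29 Oct 1994 19:43:31 GMT)
If-None-Match       =              ETag              "123-abef8r3dw"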

 In 200 (OK) responses to GET or HEAD, an origin server:

   o  SHOULD send an entity-tag validator unless it is not feasible to
      generate one.

   o  MAY send a weak entity-tag instead of a strong entity-tag, if
      performance considerations support the use of weak entity-tags, or
      if it is unfeasible to send a strong entity-tag.

   o  SHOULD send a Last-Modified value if it is feasible to send one.

   In other words, the preferred behavior for an origin server is to
   send both a strong entity-tag and a Last-Modified value in successful
   responses to a retrieval request.

   A client:

   o  MUST send that entity-tag in any cache validation request (using
      If-Match or If-None-Match) if an entity-tag has been provided by
      the origin server.
   o  SHOULD send the Last-Modified value in non-subrange cache
      validation requests (using If-Modified-Since) if only a
      Last-Modified value has been provided by the origin server.

   o  MAY send the Last-Modified value in subrange cache validation
      requests (using If-Unmodified-Since) if only a Last-Modified value
      has been provided by an HTTP/1.0 origin server.  The user agent
      SHOULD provide a way to disable this, in case of difficulty.
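How a server might evaluate these validators can be sketched as follows. The class ConditionalGet is a made-up helper, the ETag comparison is the weak comparison (a W/ prefix is ignored), and If-None-Match is given precedence over If-Modified-Since, as the HTTP specification describes. Range requests and If-Match are left out.

```java
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

/** Sketch: decide between 200 and 304 for a conditional GET. */
class ConditionalGet {

    /** Returns 304 if the client's validator still matches, otherwise 200. */
    static int status(String ifNoneMatch, String ifModifiedSince,
                      String etag, String lastModified) {
        if (ifNoneMatch != null) {
            // If-None-Match takes precedence; If-Modified-Since is then ignored.
            return matchesAny(ifNoneMatch, etag) ? 304 : 200;
        }
        if (ifModifiedSince != null && lastModified != null) {
            Instant since = parseHttpDate(ifModifiedSince);
            Instant modified = parseHttpDate(lastModified);
            return modified.isAfter(since) ? 200 : 304;
        }
        return 200; // unconditional request
    }

    /** Weak comparison of the current ETag against a comma-separated list. */
    static boolean matchesAny(String ifNoneMatch, String etag) {
        if (ifNoneMatch.trim().equals("*")) {
            return true; // assumption: a current representation exists
        }
        if (etag == null) {
            return false;
        }
        for (String candidate : ifNoneMatch.split(",")) {
            if (stripWeak(candidate.trim()).equals(stripWeak(etag))) {
                return true;
            }
        }
        return false;
    }

    static String stripWeak(String tag) {
        return tag.startsWith("W/") ? tag.substring(2) : tag;
    }

    /** Parses an HTTP-date such as "Sat, 29 Oct 1994 19:43:31 GMT". */
    static Instant parseHttpDate(String value) {
        return ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME).toInstant();
    }

    public static void main(String[] args) {
        System.out.println(status("\"123-abef8r3dw\"", null, "\"123-abef8r3dw\"", null));
    }
}
```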

Useful Cache-Control response headers include:
  • max-age=[seconds] — specifies the maximum amount of time that a representation will be considered fresh. Similar to Expires, this directive is relative to the time of the request, rather than absolute. [seconds] is the number of seconds from the time of the request you wish the representation to be fresh for.
  • s-maxage=[seconds] — similar to max-age, except that it only applies to shared (e.g., proxy) caches.
  • public — marks authenticated responses as cacheable; normally, if HTTP authentication is required, responses are automatically private.
  • private — allows caches that are specific to one user (e.g., in a browser) to store the response; shared caches (e.g., in a proxy) may not.
  • no-cache — forces caches to submit the request to the origin server for validation before releasing a cached copy, every time. This is useful to assure that authentication is respected (in combination with public), or to maintain rigid freshness, without sacrificing all of the benefits of caching.
  • no-store — instructs caches not to keep a copy of the representation under any conditions.
  • must-revalidate — tells caches that they must obey any freshness information you give them about a representation. HTTP allows caches to serve stale representations under special conditions; by specifying this header, you’re telling the cache that you want it to strictly follow your rules.
  • proxy-revalidate — similar to must-revalidate, except that it only applies to proxy caches.
When both Cache-Control and Expires are present, Cache-Control takes precedence.
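To make the interplay of these directives concrete, here is a small sketch that computes how long a cache may treat a response as fresh. The class CacheControl and the return convention (-1 for "must not store", 0 for "must revalidate before use") are choices made for this example; Expires and heuristic freshness are ignored.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Sketch: freshness lifetime implied by a Cache-Control response header. */
class CacheControl {

    /** Splits the header into directives; valueless directives map to "". */
    static Map<String, String> parse(String header) {
        Map<String, String> directives = new HashMap<>();
        for (String part : header.split(",")) {
            String[] nameValue = part.trim().split("=", 2);
            if (nameValue[0].isEmpty()) {
                continue;
            }
            String value = nameValue.length > 1 ? nameValue[1].trim() : "";
            directives.put(nameValue[0].toLowerCase(Locale.ROOT), value);
        }
        return directives;
    }

    /**
     * Returns the freshness lifetime in seconds, 0 if the response must be
     * revalidated before every use, or -1 if this cache must not store it.
     */
    static long freshnessSeconds(String header, boolean sharedCache) {
        Map<String, String> d = parse(header);
        if (d.containsKey("no-store")) return -1;
        if (sharedCache && d.containsKey("private")) return -1;
        if (d.containsKey("no-cache")) return 0;
        if (sharedCache && d.containsKey("s-maxage")) {
            return Long.parseLong(d.get("s-maxage")); // overrides max-age for shared caches
        }
        if (d.containsKey("max-age")) {
            return Long.parseLong(d.get("max-age"));
        }
        return 0; // no explicit lifetime given
    }

    public static void main(String[] args) {
        String header = "public, max-age=3600, s-maxage=600";
        System.out.println(freshnessSeconds(header, true));
        System.out.println(freshnessSeconds(header, false));
    }
}
```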


Essentially, the Vary response header lets caches know which request headers to use to figure out whether they have a valid cached response for a request. If a cache were a giant key-value store, adding Vary fields appends the values of those headers to the key, thus changing which requests are considered valid matches for what exists in the cache.
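The key-value picture can be sketched directly. The class VaryCacheKey and the key layout are invented for illustration, and the request header map is assumed to use lower-case header names as keys; real caches also normalise header values, which is skipped here.

```java
import java.util.Locale;
import java.util.Map;

/** Sketch: build a cache key that includes the headers named by Vary. */
class VaryCacheKey {

    /**
     * vary is the value of the Vary response header, for example "Accept-Encoding".
     * Assumption: requestHeaders uses lower-case header names as keys.
     */
    static String key(String method, String url, String vary,
                      Map<String, String> requestHeaders) {
        StringBuilder key = new StringBuilder(method).append(' ').append(url);
        if (vary == null || vary.trim().isEmpty()) {
            return key.toString();
        }
        for (String name : vary.split(",")) {
            String header = name.trim().toLowerCase(Locale.ROOT);
            String value = requestHeaders.get(header);
            // An absent header is recorded as an empty value.
            key.append('|').append(header).append('=').append(value == null ? "" : value);
        }
        return key.toString();
    }

    public static void main(String[] args) {
        Map<String, String> gzip = java.util.Collections.singletonMap("accept-encoding", "gzip");
        System.out.println(key("GET", "/api/article/1234", "Accept-Encoding", gzip));
    }
}
```

Two requests for the same URL that differ in a header listed by Vary produce different keys, so the cache stores and serves them separately.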

What is the correct way to version my API?
The "URL" way
A commonly used way to version your API is to add a version number in the URL. For instance:
/api/v1/article/1234
To "move" to another API, one could increase the version number:
/api/v2/article/1234
The hypermedia way
GET /api/article/1234 HTTP/1.1
Accept: application/vnd.api.article+xml; version=1.0
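On the server side, the two versioning styles above differ only in where the version is read from. A minimal sketch, assuming the /api/v{n}/ path layout and the version media-type parameter from the examples (the class ApiVersion and its method names are hypothetical):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch: read the requested API version from the URL or the Accept header. */
class ApiVersion {

    private static final Pattern URL_VERSION = Pattern.compile("^/api/v(\\d+)/");
    private static final Pattern ACCEPT_VERSION = Pattern.compile(";\\s*version=([0-9.]+)");

    /** Returns "1" for /api/v1/article/1234, or null if the path has no version. */
    static String fromUrl(String path) {
        Matcher m = URL_VERSION.matcher(path);
        return m.find() ? m.group(1) : null;
    }

    /** Returns "1.0" for "application/vnd.api.article+xml; version=1.0", or null. */
    static String fromAccept(String accept) {
        Matcher m = ACCEPT_VERSION.matcher(accept);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(fromUrl("/api/v2/article/1234"));
        System.out.println(fromAccept("application/vnd.api.article+xml; version=1.0"));
    }
}
```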

References 
http://restcookbook.com/Basics/versioning/#sthash.jRlZVJ0L.dpuf

https://www.safaribooksonline.com/library/view/rest-api-design/9781449317904/ch04.html


https://devcenter.heroku.com/articles/jax-rs-http-caching

Connect to Cloudera VM installed on your Mac using the Mac terminal

I recently started working with Cloudera Hadoop and installed the Cloudera VM from their website. I am using VirtualBox to run the Cloudera Hadoop VM.

Working directly inside a VM is slow and tedious. You can configure your Mac so that the Mac's terminal connects to the VM over ssh.

Here are the steps:
1. Open VirtualBox and add a host-only network for the Cloudera VM:
Go to File-->Preferences-->Network and click "Add Host-only network (Ins)".
This automatically creates a "vboxnet0" network.
Click OK to save the changes.


2. In a terminal window on the Cloudera Hadoop VM, run ifconfig.
This returns information for the eth0 and eth1 interfaces.
Note the inet addr associated with eth1.
3. Go to your Mac's terminal window and type:
$ ssh  training@<inet addr ip from prev step>
4. ssh will ask you to confirm the connection to the VM. Type "yes"
and you are in. You can check by typing hadoop fs at the prompt.
If this runs, then you are connected.
