04 Oct 2016, 10:40 Containers Splunk Logging Docker

Using Splunk in a Docker Container on Bluemix

The IBM Bluemix platform (based on Cloud Foundry) supports logging to external logging providers. I recently read Takehiko Amano’s excellent article on Splunk integration with IBM Bluemix. Splunk is becoming very popular across the industry for log aggregation, indexing, and searching.

Takehiko’s solution is excellent, but still requires somewhere to deploy Splunk. However, Bluemix itself provides the IBM Containers offering (based on Docker technology), where Splunk can be run. This probably isn’t suitable for robust production environments, but for quick ‘n’ dirty testing, it’s really useful. Below, I’ve documented the steps you need to get this up and running with Splunk Light, the simpler, lighter-weight edition of Splunk.

Prerequisites

Instructions for this tutorial are written assuming you are using OS X, although they can probably be adapted to other platforms fairly easily.

Build the Splunk Container

You need to build the Docker container for Splunk locally before pushing it up to the Bluemix containers repository. There’s already a well-established GitHub project for a Splunk Docker container, but we need to add the RFC5424 add-on as per Takehiko’s article to get Splunk to recognize the logging format.

I’ve already forked the GitHub repository and added most of the changes required to do that, but you will need to download the add-on itself first.

  • Open a terminal and clone my repository, checking out the bluemix branch:

    git clone -b bluemix https://github.com/andrewferrier/docker-splunk/
    
  • Download the RFC 5424 add-on. You will need to sign up for a free splunk.com ID if you don’t already have one. Put the .tgz file in the splunklight directory inside your checked-out git repository.

  • Now build the Docker image (which may take a little while):

    cd <your_checkout_directory>/splunklight
    docker build -t andrewferrier/splunk:latest-light .
    

(If you wish, you can substitute your own namespace prefix in place of andrewferrier - as long as you use it consistently below).
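The build depends on the add-on .tgz being in the splunklight directory, so a quick check before building can save a wasted run. This is only a sketch: addon_present is a hypothetical helper name of my own, and it simply looks for any .tgz file rather than for the add-on's real file name.

```shell
# Hypothetical pre-build check: succeed only if at least one .tgz file
# (assumed to be the downloaded RFC 5424 add-on) exists in the directory.
addon_present() {
  dir="$1"
  for f in "$dir"/*.tgz; do
    # If nothing matches, the pattern stays literal and -f fails.
    [ -f "$f" ] && return 0
  done
  return 1
}

# Example use, from inside the splunklight directory:
#   addon_present . && docker build -t andrewferrier/splunk:latest-light .
```

If the check fails, the docker build command is never run, which makes the missing download obvious straight away.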

Push the Splunk Container up to Bluemix and start it running

Firstly, log into Bluemix and initialize the container runtime:

bx login
bx ic init

You will need to specify an organisation and space within which to work on Bluemix.

Next, double-check what your IBM Containers “namespace” is. If you’ve worked with Containers before, you’ve probably already got one specified. You can check it with bx ic namespace-get. If you haven’t, you’ll need to set one with bx ic namespace-set (I use andrewferrier, for example - but you can set it as anything that’s meaningful to you - it will have to be unique across all users using shared Bluemix).

Now, tag your built image to prepare it for upload to the remote registry:

docker tag andrewferrier/splunk:latest-light registry.ng.bluemix.net/andrewferrier/splunk:latest-light

(Note that the first andrewferrier above is the prefix we specified previously when we built the image. The second is the namespace on Bluemix itself, as just discussed. If you want to work with the UK instance of Bluemix, rather than the US one, change all references to .ng. to .eu-gb.)

Now actually push the image to the remote registry (this may take a little while):

docker push registry.ng.bluemix.net/andrewferrier/splunk:latest-light
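Because the registry host and the namespace both appear in the tag and push commands, it can help to build the image name in one place. The following is a minimal sketch; registry_image is a hypothetical helper name, and it assumes the region codes mentioned above (ng for the US, eu-gb for the UK).

```shell
# Build the full registry image name from a region code and a namespace.
# region: "ng" (US) or "eu-gb" (UK); namespace: your IBM Containers namespace.
registry_image() {
  region="$1"
  namespace="$2"
  echo "registry.${region}.bluemix.net/${namespace}/splunk:latest-light"
}

# Example use (mynamespace is a placeholder):
#   image=$(registry_image eu-gb mynamespace)
#   docker tag andrewferrier/splunk:latest-light "$image"
#   docker push "$image"
```

Keeping the name in a variable means that switching region or namespace is a one-line change.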

Now, we need to create some persistent volumes for both the /opt/splunk/etc and the /opt/splunk/var filesystems within the container:

bx ic volume create splunk-etc
bx ic volume create splunk-var

Start the container running. Notice that we are exposing two TCP ports, 8000 (which will be used over HTTP to access the Splunk console), and 5140 (which will be used to push syslog messages from Bluemix to Splunk).

bx ic create -m 1024 -p 8000 -p 5140 \
  --env SPLUNK_START_ARGS="--accept-license" \
  --volume splunk-etc:/opt/splunk/etc \
  --volume splunk-var:/opt/splunk/var \
  registry.ng.bluemix.net/andrewferrier/splunk:latest-light

Once the container has started running, the Bluemix CLI will print out the container ID. You typically only need the first few characters - enough to uniquely identify it (e.g. abc1234).

Now check which public IP addresses you have free to assign to the container:

bx ic ips

This should print a list of IPs (probably two if you are working with a trial Bluemix account) - pick any IP which is not assigned to a container (if you have no unassigned addresses, you’ll either need to pay for more or unbind one from an existing container first). Now bind that IP address to your newly-created container:

bx ic ip-bind 1.2.3.4 abc1234

Now you’ll need to create a user-provided service to stream the logs from your application(s) to Splunk:

bx cf cups splunk -l syslog://1.2.3.4:5140

Setting up a TCP listener within Splunk

Now we need to set up a data listener within Splunk to listen for data on TCP port 5140 (essentially, this is the same procedure as Takehiko’s original article).

Open the Splunk console in a browser using the URL http://1.2.3.4:8000 (substituting the IP address you picked above). Log in using the default username/password pair admin/changeme (Splunk will then encourage you to change the password immediately, which you should).

On the home screen, click “Add Data” to add a data source:

[Screenshot: Add Data]

Select “Monitor”:

[Screenshot: Select Monitor]

Select “TCP/UDP” to add a TCP-based data listener:

[Screenshot: Select TCP]

Enter Port 5140 (the same port we exposed from the Splunk Docker container above):

[Screenshot: Enter 5140]

Select rfc5424_syslog as the source type (which corresponds to the Splunk add-on we installed previously). You may find it easiest to type rfc into the dropdown box to select this. Also, you may want to create a new index to index data from Bluemix. In this case, I’ve created one called bluemix:

[Screenshot: Input Settings]

Review the settings you’ve entered and add the data listener.
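Before wiring up a real application, it can be handy to check that the listener on port 5140 is reachable. What follows is only a sketch: rfc5424_line is a hypothetical helper name, the field values are made up, and I'm assuming nc (netcat) is available for the send. It formats a single line in the RFC 5424 layout (priority, version, timestamp, host, application name, three empty fields, message).

```shell
# Format one syslog line in the RFC 5424 layout.
# PRI = facility * 8 + severity; facility 1 (user-level) and
# severity 6 (informational) give <14>. Unused fields are "-".
rfc5424_line() {
  timestamp="$1"
  host="$2"
  app="$3"
  msg="$4"
  pri=$((1 * 8 + 6))
  printf '<%d>1 %s %s %s - - - %s\n' "$pri" "$timestamp" "$host" "$app" "$msg"
}

# Example use (replace 1.2.3.4 with the IP you bound to the container):
#   rfc5424_line "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$(hostname)" testapp \
#     "hello from my laptop" | nc 1.2.3.4 5140
```

Real Bluemix traffic may be framed differently from this hand-made line, so treat it as a connectivity check rather than a faithful simulation.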

Clone and push a demo application

In this article, we’ll clone a sample Node.js application locally and push it to Bluemix, so that we can bind it to the user-provided service we just defined and use it to test the Splunk integration.

cd <some_temporary_directory>
git clone https://github.com/IBM-Bluemix/get-started-node
cd get-started-node
curl https://new-console.ng.bluemix.net/get-started/docs/manifest.yml > manifest.yml

Now edit manifest.yml to change name and host to a unique name, e.g. TestAppForSplunkAF (this name must be unique within the whole of Bluemix, which is why I append my initials).

You also need to modify the relevant line of the server.js file to look like this:

var port = process.env.VCAP_APP_PORT || 8080;

(This ensures that the application will pick up the correct port number from the Bluemix deployed environment).

Now push the application up to Bluemix:

bx cf push

Bind the splunk user-provided service created earlier to the application (you can bind it to any other applications you wish in the same way):

bx cf bind-service TestAppForSplunkAF splunk

And restage each application:

bx cf restage TestAppForSplunkAF

Testing the logging mechanism

Just restaging your application will probably already have generated some logs. However, to make things a bit more interesting, open the endpoint for your application (e.g. http://testappforsplunkaf.mybluemix.net/ or similar, depending on the name of your application) in a browser, and refresh it a few times.
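If you'd rather not refresh the browser by hand, a small loop does the same job. This is a sketch under assumptions: generate_traffic and the DRY_RUN switch are hypothetical names of my own, and curl is assumed to be installed.

```shell
# Hypothetical helper: request a URL several times to generate log traffic.
# Set DRY_RUN to any non-empty value to print the requests instead of
# sending them.
generate_traffic() {
  url="$1"
  count="${2:-5}"
  i=1
  while [ "$i" -le "$count" ]; do
    if [ -n "${DRY_RUN:-}" ]; then
      echo "GET $url"
    else
      # Discard the body and print only the HTTP status code.
      curl -s -o /dev/null -w '%{http_code}\n' "$url"
    fi
    i=$((i + 1))
  done
}

# Example use (substitute the URL of your own application):
#   generate_traffic http://testappforsplunkaf.mybluemix.net/ 10
```

Each request should produce at least one router log line for the application, which gives Splunk something to index.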

Now, you should start to see your logging information appearing in Splunk. Assuming you set Splunk up as shown above and created a new non-default index called bluemix, you should be able to find everything by searching the bluemix index (for example, with the search index=bluemix):

[Screenshot: Search Terms]

You should see some search results appear like this:

[Screenshot: Search Results]

Further Steps

The world is now your oyster! You can use any standard Splunk searching mechanism to find logs.

Any questions or comments, contact me at andrew DOT ferrier AT uk DOT ibm DOT com.