Provisioning Elasticsearch and Kibana with Terraform

Fair warning up front: this is not a Terraform, AWS, or Elasticsearch tutorial. You'll need to know a bit or read the docs to apply the examples.

When I wanted to set up the AWS version of the ELK stack (Elasticsearch, Logstash, Kibana), which is Elasticsearch, CloudWatch, and Kibana, I hit a roadblock: Terraform did not natively support provisioning the actual streaming of logs from CloudWatch to Elasticsearch. Googling led me approximately nowhere, and I had to devise a solution from scratch.

Automating the CloudWatch setup once hosts are up is reasonably straightforward using e.g. Ansible, so the real problem is the Lambda function that parses messages and sends them to Elasticsearch. It's super simple to set up if you follow the docs and click through the AWS console, but there are few hints on how to automate it reliably.

Terraform supports setting up CloudWatch log subscription filters and Elasticsearch clusters/domains as well as Lambda functions, and once you know what to put in the Lambda, the process is just plain old Terraform. While the solution is not very sophisticated, it works for me ^(tm) and I believe it might work for others as well.

As so often in software, the solution is to copy something that works. In this case, I created a subscription filter from some random CloudWatch log group to a freshly created Elasticsearch domain on a test account, and then checked the resulting (NodeJS) code of the generated Lambda function. To my delight, the code has almost no environment- or target-specific content except for the Elasticsearch endpoint itself, so there's not much that needs to be done to adapt it to any setup.

In this case, I simply replaced references to the endpoint with references to an environment variable, ES_ENDPOINT, which I then inject when creating the Lambda function.

Putting the resulting index.js (or whatever) into a zip file, we can then provision it using Terraform:


resource "aws_lambda_function" "logs-to-es-lambda" {
  filename         = "files/es_logs_lambda.zip"
  description      = "CloudWatch Logs to Amazon ES"
  function_name    = "Logs_To_Elasticsearch"
  role             = "${aws_iam_role.my-es-execution-role.arn}"
  handler          = "index.handler"
  source_code_hash = "${base64sha256(file("files/es_logs_lambda.zip"))}"
  runtime          = "nodejs4.3"
  timeout          = 60

  environment {
    variables = {
      ES_ENDPOINT = "${aws_elasticsearch_domain.my-es-domain.endpoint}"
    }
  }
}
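
The execution role referenced above is not spelled out anywhere in this post; as a rough sketch, aws_iam_role.my-es-execution-role could look something like the following. The role and policy names as well as the exact es:ESHttpPost permission scope are my assumptions here, not copied from a working setup, so adjust them to taste:

```hcl
resource "aws_iam_role" "my-es-execution-role" {
  name = "es-logs-lambda-role"

  # Allow the Lambda service to assume this role
  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": { "Service": "lambda.amazonaws.com" },
      "Effect": "Allow"
    }
  ]
}
EOF
}

resource "aws_iam_role_policy" "my-es-execution-policy" {
  name = "es-logs-lambda-policy"
  role = "${aws_iam_role.my-es-execution-role.id}"

  # Let the function POST documents to the domain, plus write its own logs
  policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": ["es:ESHttpPost"],
      "Resource": "${aws_elasticsearch_domain.my-es-domain.arn}/*",
      "Effect": "Allow"
    },
    {
      "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
      "Resource": "arn:aws:logs:*:*:*",
      "Effect": "Allow"
    }
  ]
}
EOF
}
```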

With the Lambda function in place, we can create a log subscription filter to send events to it:


resource "aws_lambda_permission" "cloudwatch-lambda-permission" {
  statement_id  = "allow-cloudwatch-lambda"
  action        = "lambda:InvokeFunction"
  function_name = "${aws_lambda_function.logs-to-es-lambda.arn}"
  principal     = "logs.${var.my-aws-region}.amazonaws.com"
  source_arn    = "${aws_cloudwatch_log_group.my-log-group.arn}"
}


resource "aws_cloudwatch_log_subscription_filter" "logs-subscription" {
  name            = "ElasticsearchStream-logs"
  depends_on      = ["aws_lambda_permission.cloudwatch-lambda-permission"]
  log_group_name  = "${aws_cloudwatch_log_group.my-log-group.name}"
  filter_pattern  = "[timestamp, level, thread, name, message]"
  destination_arn = "${aws_lambda_function.logs-to-es-lambda.arn}"
}

The filter pattern follows the same rules as in the AWS console, and I'd generally use the console and some example data to try it out since I find the documentation incredibly hard to find and parse.

For a full example, you'd of course need to provision the role (aws_iam_role) and the domain (aws_elasticsearch_domain), and stream some interesting logs to the log group! In addition, the first time I did this I hooked up a quick Python hack to inject the endpoint and zip the JavaScript source on the fly, but in retrospect I don't think that makes much sense. In the last couple of projects where I've done this, I've simply committed the ZIP to the source repository instead.
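
That said, if you'd rather keep building the ZIP on the fly, Terraform's archive_file data source can replace a homegrown zipping script; a sketch, assuming the Lambda source lives in files/index.js:

```hcl
# Build the ZIP at plan/apply time instead of committing it to the repo
data "archive_file" "es_logs_lambda" {
  type        = "zip"
  source_file = "files/index.js"
  output_path = "files/es_logs_lambda.zip"
}
```

The aws_lambda_function resource can then point its filename at "${data.archive_file.es_logs_lambda.output_path}" and, depending on your Terraform/provider version, use the data source's output_base64sha256 attribute for source_code_hash instead of hashing the file by hand.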
