July 16, 2019

Secure Logging for Kubernetes

Getting Started

I wanted to walk the community through deploying the Open Distro for Elasticsearch (ODfE) in Kubernetes as the core logging provider. I had a use case for secure logging in k8s, and I wanted to share my struggles with the community so others can learn from my pain and suffering. I am calling it the ODfEFK: the Open Distro for Elasticsearch, Fluent, and Kibana, which are all the required components for secure logging within k8s. I'm not great with acronyms, but it's an absurd mouthful that makes me laugh, so why not?

It's important to note that while I am using vanilla k8s (1.14 at the time of publishing), I am deploying it with the Cloud Foundry Container Runtime, kubo, which is a sustainable methodology for deploying and operating k8s clusters over time. The goal of this walkthrough is really to highlight a sustainable set of configurations that is both a bit modular and easy to work with. While you can use managed k8s offerings like AKS or VKE, I will be focusing on deploying and managing on top of kubo, as that's what I have available to me.

As a key point, if you just want a safe and secure Elasticsearch deployment without anything else, follow all the steps leading up to and including the Elasticsearch deployment and then stop there. You'll have a fully functioning and secure Elasticsearch deployment at that point.

Requirements

Before we get started, there are a couple things you'll need to figure out.

  1. You need a bosh (re: BOSH) deployment. I am using vSphere but you can use whatever bosh supports.
  2. You need a Credhub instance. The one deployed with bosh will suffice.
  3. You need a persistent storage provider. I am using NFS in this example, but any k8s-compatible persistent storage provider should do.
  4. You need to deploy kubo. It's really easy, you can find a set of instructions here. See the Appendix for how I deployed mine.
  5. You must be able to run privileged containers. This will not work without it.

Once you have those items ready to go, we'll go over the high level stuff.

Order of Operations

If you're like me, you've skipped this part, reading only the code sections, because you definitely know what you need. When you come back to this part, you'll read my disclaimer: you need to follow these steps in this particular order or it might break. The deployment dependency tree for managing this system is extensive, and if you don't follow the steps in order, you will spend an inordinate amount of time drunkenly working your way through these configurations as I did.

  1. Create a Certificate Authority
  2. Create transport and HTTP certificates.
  3. Create the Namespace.
  4. Upload the certificates.
  5. Draw the rest of the Owl.

There are some extra steps that may be necessary:

  • Crying on your keyboard because you tried using the OpenSSL CLI.
  • Deleting the namespace because you uploaded the ConfigMaps 3 times but you can't figure out what namespace you put them in.
  • Starting over.

This will take a lot of patience because there are a lot of moving parts.


Create a Certificate Authority

The first part of getting things started is creating a certificate authority. The reason we need one is that this will be a semi-private deployment: everything in the cluster will be secured but won't have external resolution, so you'll have to create a certificate keypair dedicated to it. If you're like me and you don't have $250,000 lying around for a cross-signed root certificate, then you need to generate a certificate authority.

This is actually the easiest part of the whole process. Assuming you've got Credhub configured and ready to go, you can run a command similar to this:

credhub generate \
    --type certificate \
    --is-ca \
    --name /aperture/root-ca \
    --organization "Aperture Analytics" \
    --organization-unit "Engineering" \
    --country "US" \
    --state "CO" \
    --locality "Boulder"

That will give you a sane certificate authority that you can generate more certificates from. To the best of my knowledge, unless you pass the --self-signed flag during generation, this is technically an intermediate CA, which makes our certificates leaf certificates, but that's okay: either self-signed or intermediate is fine. The root certificate can be found with credhub get -n /aperture/root-ca.
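
If you want to eyeball what you just generated, you can pull the PEM back out and hand it to openssl; a quick sketch, assuming the jq and openssl CLIs are installed:

credhub get -j -n /aperture/root-ca \
    | jq -r '.value.ca' \
    | openssl x509 -noout -subject -issuer -dates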


Create the Certificates

This step took me a very long time, because I took a sidestep and decided to try my hand at the OpenSSL CLI. That did not go well. Neither did trying to use cfssl, which is a great tool, but I drove that struggle bus until it crashed into a tree.

So let's generate the certificate:

credhub generate --type certificate \
    --name /aperture/elasticsearch \
    --ca /aperture/root-ca \
    --common-name "elasticsearch" \
    --organization "ApertureAnalytics" \
    --organization-unit "Engineering" \
    --country "US" \
    --state "CO" \
    --locality "Boulder" \
    --alternative-name "es-cluster-0.elasticsearch.kube-logging.svc.cluster.local" \
    --alternative-name "es-cluster-1.elasticsearch.kube-logging.svc.cluster.local" \
    --alternative-name "es-cluster-2.elasticsearch.kube-logging.svc.cluster.local" \
    --alternative-name "es-cluster-3.elasticsearch.kube-logging.svc.cluster.local" \
    --alternative-name "es-cluster-4.elasticsearch.kube-logging.svc.cluster.local" \
    --alternative-name "es-cluster-5.elasticsearch.kube-logging.svc.cluster.local" \
    --alternative-name "es-cluster-6.elasticsearch.kube-logging.svc.cluster.local" \
    --alternative-name "*.elasticsearch.kube-logging.svc.cluster.local" \
    --alternative-name "elasticsearch.kube-logging.svc.cluster.local" \
    --alternative-name "elasticsearch.kube-logging" \
    --alternative-name "elasticsearch" \
    --ext-key-usage "server_auth" \
    --ext-key-usage "client_auth"

Let's work through this a bit. First, we set the internal reference name, /aperture/elasticsearch, which is how we'll grab the values later. Then we reference /aperture/root-ca so that it signs the certificate, and set our common name, elasticsearch. While YMMV, I ran into a frustrating number of problems with spaces in any of the subject fields, so ensure there are no spaces or non-alphabetic characters. You can put in as many alternative names as you want; I put in 7 so I can run up to a 7-node cluster, plus a wildcard in case I ever feel like changing the name of the StatefulSet.

If you are copying these directions with no variation on alternative names, or don't want the wildcard, it's important to use exactly what you see, because the name of the StatefulSet has to match the hostname (minus the trailing - and integer), which is es-cluster in this case. I added a whack of extra alternative names and a wildcard just for safety and sanity, but you can definitely limit this to just the first 7 or so alternative names I use. It is also important to note that in ODfE all nodes need all SANs, so however you go about configuring your networking domain, keep that in mind.

It's important to note that ODfE requires the certificate to support the serverAuth and clientAuth Extended Key Usages. Make sure not to miss those, or you will have TLS errors everywhere.
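
If you want to double-check that the SANs and the Extended Key Usages actually made it into the generated leaf certificate, a quick inspection (again assuming openssl and jq are on your path) looks something like this:

openssl x509 -noout -text \
    -in <(credhub get -j -n /aperture/elasticsearch | jq -r '.value.certificate') \
    | grep -A1 -E "Subject Alternative Name|Extended Key Usage"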


Create the Namespace

This is also an incredibly easy step, and for the most part, it's really hard to mess up. Save this file as 01-LoggingNamespace.yml.

kind: Namespace
apiVersion: v1
metadata:
  name: kube-logging

From here on out, we're going to work exclusively in the kube-logging namespace. That keeps everything we deploy for the logging subsystem segregated and makes it easy to view logs for it.
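
If you'd rather not type --namespace kube-logging on every command, you can make it the default for your current kubectl context; the commands in this post keep the explicit flag so they copy-paste safely either way:

kubectl config set-context "$(kubectl config current-context)" --namespace kube-logging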

Create it:

% kubectl apply -f 01-LoggingNamespace.yml
% kubectl describe ns/kube-logging
Name:         kube-logging
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration: {"apiVersion":"v1","kind":"Namespace","metadata":{"annotations":{},"name":"kube-logging"}}
Status:       Active

No resource quota.

No resource limits.

Upload the Certificates

So we created the certificates, but they reside in Credhub. Right now there's no CRD and controller for Credhub which can autogenerate credentials and certificates (hint hint), so for now our certificates reside purely in Credhub and are inaccessible by both ODfEFK and Kubernetes.

There exists, in my mind, a limitation of the Kubernetes Secrets provider in that you can't create a TLS secret which includes a root certificate through the k8s CLI. So, because of that, we need to upload the root certificate as a generic secret:

kubectl create secret generic elasticsearch-tls-ca \
    --from-file=ca.pem=<(credhub get \
        -j \
        -n /aperture/root-ca \
        | jq -r '.value.ca') \
    --namespace kube-logging

If you're wondering what's happening, I'll explain. We're creating a generic secret in the kube-logging namespace with a single file called ca.pem, which has the contents of the /aperture/root-ca certificate. The <() syntax is how you can redirect output from a command to a file reference in bash/zsh, so we're really just redirecting the raw output from the Credhub command as if it were coming from a file. I'd recommend running the Credhub command piece by piece if you don't quite understand it, because the next command will be that on steroids.
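
If the process substitution still feels odd, you can see exactly what the kubectl command receives by running the inner piece through cat; it's just the CA PEM presented as a temporary file:

cat <(credhub get -j -n /aperture/root-ca | jq -r '.value.ca')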

Once we've uploaded the root certificate, we need to go through and upload the transport and HTTP keypair. For this, we can use a TLS secret, like so:

kubectl create secret tls elasticsearch-tls \
    --cert=<(credhub get \
        -j \
        -n /aperture/elasticsearch \
        | jq -r '.value.certificate') \
    --key=<(openssl pkcs8 \
        -v1 "PBE-SHA1-3DES" \
        -in <(credhub get \
            -j \
            -n /aperture/elasticsearch \
            | jq -r '.value.private_key') \
        -inform pem -topk8 -nocrypt) \
    --namespace kube-logging

Yes, you are seeing that correctly: nested file referencing. The reason this command is so ridiculous is that ODfE requires the PKCS#8 format for its keys. It took me almost an hour of unhelpful googling before I stumbled across a very random blog post with a single reference to PKCS#8 key conversion, mentioning that you need to convert only the private key. I wish I had saved that reference; I'll update this post if I find it again. The basic premise behind the conversion is that PKCS#8 is the key format the Java keystore understands, and since Credhub can't generate keys in that format, we have to convert the private key before the upload.
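
If you want to convince yourself the conversion worked, a PKCS#8 key begins with -----BEGIN PRIVATE KEY----- rather than -----BEGIN RSA PRIVATE KEY-----. A minimal check, assuming openssl and jq are installed:

openssl pkcs8 -topk8 -nocrypt -inform pem \
    -in <(credhub get -j -n /aperture/elasticsearch | jq -r '.value.private_key') \
    | head -n 1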

While both of those secret-creation commands may be a bit wonky, all they are really doing is ensuring we as humans are not responsible for copying and pasting raw certificate data. After both of them, you should have the certificates uploaded and it should look something like this:

% kubectl describe secrets/elasticsearch-tls-ca --namespace kube-logging
Name:         elasticsearch-tls-ca
Namespace:    kube-logging
Labels:       <none>
Annotations:  <none>

Type:  Opaque

Data
====
ca.pem:  1319 bytes

% kubectl describe secrets/elasticsearch-tls --namespace kube-logging
Name:         elasticsearch-tls
Namespace:    kube-logging
Labels:       <none>
Annotations:  <none>

Type:  kubernetes.io/tls

Data
====
tls.crt:  2160 bytes
tls.key:  1704 bytes

Elasticsearch

We're building this in three separate parts: a ConfigMap to hold our elasticsearch.yml configuration, a headless Service to control the networking domain, and a StatefulSet to manage the deployment.

Configuration

Elasticsearch has a core configuration file called elasticsearch.yml. This configuration file controls everything that Elasticsearch needs in order to run in a desired configuration. As we are using the pre-built ODfE containers Amazon is publishing, we'll have no control over how Elasticsearch deploys directly. One could argue this isn't the way Amazon intended for it to be used, but I'm very lazy and this was easier than trying to figure out how to build my own containers.

Here is a good baseline elasticsearch.yml configuration. Save this file as 02-ElasticsearchConfigMap.yml.

apiVersion: v1
kind: ConfigMap
metadata:
  name: elasticsearch-config
  namespace: kube-logging
data:
  elasticsearch.yml: |
    cluster.initial_master_nodes: es-cluster-0,es-cluster-1,es-cluster-2
    cluster.name: k8s-logs
    cluster.routing.allocation.disk.threshold_enabled: false
    discovery.seed_hosts: es-cluster-0,es-cluster-1,es-cluster-2
    network.host: 0.0.0.0
    opendistro_security.allow_default_init_securityindex: true
    opendistro_security.audit.type: internal_elasticsearch
    opendistro_security.check_snapshot_restore_write_privileges: true
    opendistro_security.enable_snapshot_restore_privilege: true
    opendistro_security.nodes_dn:
      - "L=Boulder,O=ApertureAnalytics,ST=CO,C=US,OU=Engineering,CN=elasticsearch"
    opendistro_security.restapi.roles_enabled: ["all_access", "security_rest_api_access"]
    opendistro_security.ssl.http.enabled: true
    opendistro_security.ssl.http.pemcert_filepath: cert.pem
    opendistro_security.ssl.http.pemkey_filepath: key.pem
    opendistro_security.ssl.http.pemtrustedcas_filepath: ca.pem
    opendistro_security.ssl.transport.enabled: true
    opendistro_security.ssl.transport.enforce_hostname_verification: false
    opendistro_security.ssl.transport.pemcert_filepath: cert.pem
    opendistro_security.ssl.transport.pemkey_filepath: key.pem
    opendistro_security.ssl.transport.pemtrustedcas_filepath: ca.pem

Let's take a look at some of the sections, so it will make some more sense.

cluster.initial_master_nodes: es-cluster-0,es-cluster-1,es-cluster-2
cluster.name: k8s-logs
cluster.routing.allocation.disk.threshold_enabled: false
discovery.seed_hosts: es-cluster-0,es-cluster-1,es-cluster-2
network.host: 0.0.0.0

Here is where we set the core cluster configuration. We're stating that the initial master nodes are es-cluster-{0,1,2}, which form the required quorum for our cluster. We're disabling the disk allocation decider because we're leveraging a shared NFS server; it's all going to the same place, so who cares which node shards land on? We're also telling Elasticsearch to listen on any address and to discover the seed hosts required to build the master node quorum.

opendistro_security.allow_default_init_securityindex: true
opendistro_security.audit.type: internal_elasticsearch
opendistro_security.check_snapshot_restore_write_privileges: true
opendistro_security.enable_snapshot_restore_privilege: true
opendistro_security.nodes_dn:
  - "L=Boulder,O=ApertureAnalytics,ST=CO,C=US,OU=Engineering,CN=elasticsearch"
opendistro_security.restapi.roles_enabled: ["all_access", "security_rest_api_access"]

Here's where we set some of the core ODfE security plugin settings. We're enabling internal security auditing (recommended for shared clusters), ensuring we can create and restore snapshots through the security plugin-managed roles, setting the expected node certificate DNs, and listing the roles allowed to use the security REST API.

While ODfE offers a configuration option, opendistro_security.ssl.transport.enabled, which theoretically lets you disable transport TLS, it's actually a lie: you cannot disable transport TLS between nodes. Because the security plugin requires transport TLS, you are required to register the expected node certificate Distinguished Names. Never fear, you don't have to build that DN yourself! There is an OpenSSL command you can use as a reference:

% openssl x509 \
    -subject \
    -nameopt RFC2253 \
    -noout \
    -in <(credhub get \
        -j \
        -n /aperture/elasticsearch \
        | jq -r '.value.certificate')

subject=L=Boulder,O=ApertureAnalytics,ST=CO,C=US,OU=Engineering,CN=elasticsearch

You can see the output of my certificate subject at the bottom of the command. Yours will look different (hopefully!), and then you can copy and paste that into the opendistro_security.nodes_dn array, just remember to quote it so it's read as a YAML string.

opendistro_security.ssl.http.enabled: true
opendistro_security.ssl.http.pemcert_filepath: cert.pem
opendistro_security.ssl.http.pemkey_filepath: key.pem
opendistro_security.ssl.http.pemtrustedcas_filepath: ca.pem
opendistro_security.ssl.transport.enabled: true
opendistro_security.ssl.transport.enforce_hostname_verification: false
opendistro_security.ssl.transport.pemcert_filepath: cert.pem
opendistro_security.ssl.transport.pemkey_filepath: key.pem
opendistro_security.ssl.transport.pemtrustedcas_filepath: ca.pem

These fields are mostly around transport and HTTP TLS configurations. Don't worry too much about there not being an absolute path, I'll explain how that works in a bit. These files directly relate to the secrets we created earlier.

Go ahead and create the ConfigMap:

% kubectl apply -f 02-ElasticsearchConfigMap.yml
% kubectl describe configmaps/elasticsearch-config --namespace kube-logging
Name:         elasticsearch-config
Namespace:    kube-logging
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","data":{"elasticsearch.yml":"cluster.initial_master_nodes: es-cluster-0,es-cluster-1,es-cluster-2\ncluster.name: k8s-lo..."}

Data
====
elasticsearch.yml:
----
cluster.initial_master_nodes: es-cluster-0,es-cluster-1,es-cluster-2
cluster.name: k8s-logs
cluster.routing.allocation.disk.threshold_enabled: false
discovery.seed_hosts: es-cluster-0,es-cluster-1,es-cluster-2
network.host: 0.0.0.0
opendistro_security.allow_default_init_securityindex: true
opendistro_security.audit.type: internal_elasticsearch
opendistro_security.check_snapshot_restore_write_privileges: true
opendistro_security.enable_snapshot_restore_privilege: true
opendistro_security.nodes_dn:
  - "L=Boulder,O=ApertureAnalytics,ST=CO,C=US,OU=Engineering,CN=elasticsearch"
opendistro_security.restapi.roles_enabled: ["all_access", "security_rest_api_access"]
opendistro_security.ssl.http.enabled: true
opendistro_security.ssl.http.pemcert_filepath: cert.pem
opendistro_security.ssl.http.pemkey_filepath: key.pem
opendistro_security.ssl.http.pemtrustedcas_filepath: ca.pem
opendistro_security.ssl.transport.enabled: true
opendistro_security.ssl.transport.enforce_hostname_verification: false
opendistro_security.ssl.transport.pemcert_filepath: cert.pem
opendistro_security.ssl.transport.pemkey_filepath: key.pem
opendistro_security.ssl.transport.pemtrustedcas_filepath: ca.pem

Service

Now we need to create the headless Service to control the networking domain. This isn't anything terribly special, but a necessity. Save this to 03-ElasticsearchService.yml:

kind: Service
apiVersion: v1
metadata:
  name: elasticsearch
  namespace: kube-logging
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
    - port: 9200
      targetPort: 9200
      name: rest
    - port: 9300
      name: inter-node
    - port: 9600
      targetPort: 9600
      name: performance

We set spec.clusterIP to None because the nodes will be communicating with each other directly, not through the service. We do expose ports 9200, 9300, and 9600 for mostly internal traffic: 9200 is the REST API, which Kibana will need; 9300 is the inter-node communication (re: transport); and 9600 is specific to ODfE and serves the performance analyzer plugin.

Apply and run:

% kubectl apply -f 03-ElasticsearchService.yml
% kubectl describe svc/elasticsearch --namespace kube-logging
Name:              elasticsearch
Namespace:         kube-logging
Labels:            app=elasticsearch
Annotations:       kubectl.kubernetes.io/last-applied-configuration:
                     {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app":"elasticsearch"},"name":"elasticsearch","namespace":"kube...
Selector:          app=elasticsearch
Type:              ClusterIP
IP:                None
Port:              rest  9200/TCP
TargetPort:        9200/TCP
Endpoints:         10.200.27.66:9200,10.200.57.66:9200,10.200.67.84:9200
Port:              inter-node  9300/TCP
TargetPort:        9300/TCP
Endpoints:         10.200.27.66:9300,10.200.57.66:9300,10.200.67.84:9300
Port:              performance  9600/TCP
TargetPort:        9600/TCP
Endpoints:         10.200.27.66:9600,10.200.57.66:9600,10.200.67.84:9600
Session Affinity:  None
Events:            <none>

You won't see any endpoints right now because you haven't deployed the StatefulSet yet; they'll fill in (as in the output above) once the pods are running.
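
Once the StatefulSet from the next section is running, you can also confirm that the per-pod DNS names behind this headless Service resolve, for example with a throwaway busybox pod (busybox:1.28 is pinned here because nslookup misbehaves on some newer tags):

kubectl run -it --rm dns-test \
    --image=busybox:1.28 \
    --restart=Never \
    --namespace kube-logging \
    -- nslookup es-cluster-0.elasticsearch.kube-logging.svc.cluster.local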

Deployment

If you thought the configuration was complicated, you are in for a world of hurt. In all fairness, this is the hardest part of the whole configuration, and once you have this, the rest is easy in comparison.

Create a file called ElasticsearchStatefulSet.yml and fill it with these contents:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
  namespace: kube-logging
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
        - name: elasticsearch
          image: amazon/opendistro-for-elasticsearch:1.0.1
          resources:
            limits:
              cpu: 1000m
            requests:
              cpu: 100m
          ports:
            - containerPort: 9200
              name: rest
              protocol: TCP
            - containerPort: 9300
              name: inter-node
              protocol: TCP
            - containerPort: 9600
              name: performance
              protocol: TCP
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
            - name: elasticsearch-config
              mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
              subPath: elasticsearch.yml
            - name: elasticsearch-tls-ca
              mountPath: /usr/share/elasticsearch/config/ca.pem
              subPath: ca.pem
              readOnly: true
            - name: elasticsearch-tls-cert
              mountPath: /usr/share/elasticsearch/config/cert.pem
              subPath: tls.crt
              readOnly: true
            - name: elasticsearch-tls-key
              mountPath: /usr/share/elasticsearch/config/key.pem
              subPath: tls.key
              readOnly: true
          env:
            - name: cluster.name
              value: k8s-logs
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: discovery.seed_hosts
              value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
            - name: cluster.initial_master_nodes
              value: "es-cluster-0,es-cluster-1,es-cluster-2"
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"
      volumes:
        - name: elasticsearch-config
          configMap:
            name: elasticsearch-config
            items:
              - key: elasticsearch.yml
                path: elasticsearch.yml
        - name: elasticsearch-tls-ca
          secret:
            secretName: elasticsearch-tls-ca
            items:
              - key: ca.pem
                path: ca.pem
        - name: elasticsearch-tls-cert
          secret:
            secretName: elasticsearch-tls
            items:
              - key: tls.crt
                path: tls.crt
        - name: elasticsearch-tls-key
          secret:
            secretName: elasticsearch-tls
            items:
              - key: tls.key
                path: tls.key
      initContainers:
        - name: fix-permissions
          image: busybox
          command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
          securityContext:
            privileged: true
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
        - name: increase-vm-max-map
          image: busybox
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
        - name: increase-fd-ulimit
          image: busybox
          command: ["sh", "-c", "ulimit -n 65536"]
          securityContext:
            privileged: true
  volumeClaimTemplates:
    - metadata:
        name: data
        labels:
          app: elasticsearch
      spec:
        accessModes: [ "ReadWriteMany" ]
        storageClassName: nfs-client
        resources:
          requests:
            storage: 25Gi

This is a full and complete configuration for running just ODfE Elasticsearch in k8s. If you only want an Elasticsearch cluster, this is your stopping point. The rest of the things which come after are really just for humans. Obviously, finish this subsection, but then stop.

containers:
- name: elasticsearch
  image: amazon/opendistro-for-elasticsearch:1.0.1
  resources:
    limits:
      cpu: 1000m
    requests:
      cpu: 100m
  ports:
    - containerPort: 9200
      name: rest
      protocol: TCP
    - containerPort: 9300
      name: inter-node
      protocol: TCP
    - containerPort: 9600
      name: performance
      protocol: TCP
  volumeMounts:
    - name: data
      mountPath: /usr/share/elasticsearch/data
    - name: elasticsearch-config
      mountPath: /usr/share/elasticsearch/config/elasticsearch.yml
      subPath: elasticsearch.yml
    - name: elasticsearch-tls-ca
      mountPath: /usr/share/elasticsearch/config/ca.pem
      subPath: ca.pem
      readOnly: true
    - name: elasticsearch-tls-cert
      mountPath: /usr/share/elasticsearch/config/cert.pem
      subPath: tls.crt
      readOnly: true
    - name: elasticsearch-tls-key
      mountPath: /usr/share/elasticsearch/config/key.pem
      subPath: tls.key
      readOnly: true
  env:
    - name: node.name
      valueFrom:
        fieldRef:
          fieldPath: metadata.name
    - name: ES_JAVA_OPTS
      value: "-Xms512m -Xmx512m"

What you see above is the core container configuration; the rest is a bunch of modularity and sustainability thrown on top for humans. This is the meat of the meal. We're setting resource limits (so Elasticsearch doesn't eat the k8s cluster), exposing the necessary ports, mounting a data volume for index storage, mounting some files, and then setting the node name and some Java options. I'm also using the upstream container images from Amazon, amazon/opendistro-for-elasticsearch, since I'm very lazy, and the 1.0.1 tag because at the time of publishing that is the latest version.

The files we're loading in the volumeMounts:

  • elasticsearch.yml
  • ca.pem
  • cert.pem
  • key.pem

Those files come from the ConfigMap and Secrets we created not too long ago. We mount them at /usr/share/elasticsearch/config/{file} because the default Elasticsearch configuration path is /usr/share/elasticsearch/config. We need subPath entries because we are mounting individual files rather than whole directories, so k8s lays them down properly. Take note that the secret-backed files are mounted readOnly: true, as k8s will not mount secrets that are not read-only.
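
Once a pod is up, a quick way to confirm that all four files landed where Elasticsearch expects them is to list the config directory:

kubectl -n kube-logging exec es-cluster-0 -- ls /usr/share/elasticsearch/config/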

initContainers:
  - name: fix-permissions
    image: busybox
    command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
    securityContext:
      privileged: true
    volumeMounts:
      - name: data
        mountPath: /usr/share/elasticsearch/data
  - name: increase-vm-max-map
    image: busybox
    command: ["sysctl", "-w", "vm.max_map_count=262144"]
    securityContext:
      privileged: true
  - name: increase-fd-ulimit
    image: busybox
    command: ["sh", "-c", "ulimit -n 65536"]
    securityContext:
      privileged: true

It's important to note that Elasticsearch does have some requirements. First, we're ensuring that our NFS store reads and writes files as 1000:1000, since that is the UID and GID Elasticsearch runs as. Second, we need to increase the virtual memory max map count to 262144. From the Elasticsearch docs:

Elasticsearch also requires the ability to create many memory-mapped areas. The maximum map count check checks that the kernel allows a process to have at least 262,144 memory-mapped areas and is enforced on Linux only. To pass the maximum map count check, you must configure vm.max_map_count via sysctl to be at least 262144.

Thirdly, we need to ensure the file descriptor limit is maxed out as Elasticsearch will be creating a lot of indices with a lot of replicas (linear, but still a ton).

Now, the easiest and most repeatable way to satisfy all three requirements is through initContainers, as they are guaranteed to run no matter where you take this configuration.
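
Once the pods are running, you can spot-check that the kernel setting stuck; this sketch assumes sh is available in the ODfE image, which it was in the images I used:

kubectl -n kube-logging exec es-cluster-0 -- cat /proc/sys/vm/max_map_count
kubectl -n kube-logging exec es-cluster-0 -- sh -c 'ulimit -n'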

volumeClaimTemplates:
- metadata:
    name: data
    labels:
      app: elasticsearch
  spec:
    accessModes: [ "ReadWriteMany" ]
    storageClassName: nfs-client
    resources:
      requests:
        storage: 25Gi

Finally, we have our persistent volume claim template. This is the core template for each Elasticsearch node's persistent disk. As I mentioned earlier, I am using the NFS provisioner and have its storage class named nfs-client; you can replace that with whatever storage provisioner you want.
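
Once the StatefulSet is applied, each replica gets its own claim named after the template and the pod (data-es-cluster-0, data-es-cluster-1, and so on), and you can keep an eye on them with:

kubectl get pvc --namespace kube-logging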

Now we can go through and deploy it:

% kubectl apply -f ElasticsearchStatefulSet.yml
% kubectl describe sts/es-cluster --namespace kube-logging
Name:               es-cluster
Namespace:          kube-logging
CreationTimestamp:  Mon, 15 Jul 2019 00:50:51 -0600
Selector:           app=elasticsearch
Labels:             <none>
Annotations:        kubectl.kubernetes.io/last-applied-configuration:
                      {"apiVersion":"apps/v1","kind":"StatefulSet","metadata":{"annotations":{},"name":"es-cluster","namespace":"kube-logging"},"spec":{"replica..."}
Replicas:           3 desired | 3 total
Update Strategy:    RollingUpdate
  Partition:        824643162456
Pods Status:        3 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:  app=elasticsearch
  Init Containers:
   fix-permissions:
    Image:      busybox
    Port:       <none>
    Host Port:  <none>
    Command:
      sh
      -c
      chown -R 1000:1000 /usr/share/elasticsearch/data
    Environment:  <none>
    Mounts:
      /usr/share/elasticsearch/data from data (rw)
   increase-vm-max-map:
    Image:      busybox
    Port:       <none>
    Host Port:  <none>
    Command:
      sysctl
      -w
      vm.max_map_count=262144
    Environment:  <none>
    Mounts:       <none>
   increase-fd-ulimit:
    Image:      busybox
    Port:       <none>
    Host Port:  <none>
    Command:
      sh
      -c
      ulimit -n 65536
    Environment:  <none>
    Mounts:       <none>
  Containers:
   elasticsearch:
    Image:       amazon/opendistro-for-elasticsearch:1.0.1
    Ports:       9200/TCP, 9300/TCP, 9600/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP
    Limits:
      cpu:  1
    Requests:
      cpu:  100m
    Environment:
      cluster.name:                  k8s-logs
      node.name:                      (v1:metadata.name)
      discovery.seed_hosts:          es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch
      cluster.initial_master_nodes:  es-cluster-0,es-cluster-1,es-cluster-2
      ES_JAVA_OPTS:                  -Xms512m -Xmx512m
    Mounts:
      /usr/share/elasticsearch/config/ca.pem from elasticsearch-tls-ca (ro,path="ca.pem")
      /usr/share/elasticsearch/config/cert.pem from elasticsearch-tls-cert (ro,path="tls.crt")
      /usr/share/elasticsearch/config/elasticsearch.yml from elasticsearch-config (rw,path="elasticsearch.yml")
      /usr/share/elasticsearch/config/key.pem from elasticsearch-tls-key (ro,path="tls.key")
      /usr/share/elasticsearch/data from data (rw)
  Volumes:
   elasticsearch-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      elasticsearch-config
    Optional:  false
   elasticsearch-tls-ca:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elasticsearch-tls-ca
    Optional:    false
   elasticsearch-tls-cert:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elasticsearch-tls
    Optional:    false
   elasticsearch-tls-key:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elasticsearch-tls
    Optional:    false
Volume Claims:
  Name:          data
  StorageClass:  nfs-client
  Labels:        app=elasticsearch
  Annotations:   <none>
  Capacity:      25Gi
  Access Modes:  [ReadWriteMany]
Events:          <none>
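
Before moving on to Kibana, it's worth confirming that the cluster actually formed and that TLS on the REST layer behaves. A minimal check, assuming the default admin / admin demo credentials are still in place and that curl is available inside the ODfE image (it was in the images I used):

kubectl -n kube-logging exec es-cluster-0 -- \
    curl -sk -u admin:admin 'https://localhost:9200/_cluster/health?pretty'

A green or yellow status with "number_of_nodes" : 3 means the quorum formed.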

Kibana

Now that we have Elasticsearch up and running, we need to ensure we can access the cluster before we start sending data to it. Just like Elasticsearch, we have three parts: a ConfigMap, a Service, and, instead of a StatefulSet, a Deployment.

Configuration

For all of the same reasons we split out the configs in Elasticsearch, we're doing the same in Kibana. There's a ConfigMap which holds the Kibana-specific configuration as we're not custom building a container.

Create a file called 04-KibanaConfigMap.yml and populate it with this:

apiVersion: v1
kind: ConfigMap
metadata:
  name: kibana-config
  namespace: kube-logging
data:
  kibana.yml: |
    server.name: kibana
    server.host: 0.0.0.0
    elasticsearch.hosts: https://elasticsearch:9200
    elasticsearch.ssl.verificationMode: full
    elasticsearch.ssl.certificateAuthorities: ["/usr/share/kibana/config/ca.pem"]
    elasticsearch.username: kibanaserver
    elasticsearch.password: kibanaserver
    elasticsearch.requestHeadersWhitelist: ["securitytenant","Authorization"]
    opendistro_security.multitenancy.enabled: true
    opendistro_security.multitenancy.tenants.preferred: ["Private", "Global"]

Diving into the kibana.yml configuration, we can see some core settings necessary for a successful deployment. The server tree is really just the Kibana web server configuration, ensuring it listens on any address.

elasticsearch.hosts: https://elasticsearch:9200
elasticsearch.ssl.verificationMode: full
elasticsearch.ssl.certificateAuthorities: ["/usr/share/kibana/config/ca.pem"]
elasticsearch.username: kibanaserver
elasticsearch.password: kibanaserver
elasticsearch.requestHeadersWhitelist: ["securitytenant","Authorization"]

Here's where we configure Kibana's connection to Elasticsearch, including its approved client credentials, the root certificate, and header configuration. The elasticsearch.ssl.certificateAuthorities key needs the /aperture/root-ca certificate authority that Elasticsearch is using, so we're just going to go ahead and mount it later. Kibana's default configuration path in these containers is /usr/share/kibana/config, so we'll make sure this configuration gets mounted there.
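
As a quick aside, you can sanity-check those kibanaserver demo credentials against the cluster before Kibana ever boots. This assumes the demo users were initialized (which allow_default_init_securityindex: true takes care of) and, again, that curl is present in the ODfE image:

kubectl -n kube-logging exec es-cluster-0 -- \
    curl -sk -u kibanaserver:kibanaserver https://localhost:9200/_opendistro/_security/authinfo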

opendistro_security.multitenancy.enabled: true
opendistro_security.multitenancy.tenants.preferred: ["Private", "Global"]

Here we are just enabling multi-tenancy. Configuring it is out of the scope of this specific post (dear god, could you imagine how long this post would be??), but we're enabling it for now, so we can poke around with it later.

Apply!

% kubectl apply -f 04-KibanaConfigMap.yml
% kubectl describe configmaps/kibana-config --namespace kube-logging
Name:         kibana-config
Namespace:    kube-logging
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"v1","data":{"kibana.yml":"server.name: kibana\nserver.host: 0.0.0.0\nelasticsearch.hosts: https://elasticsearch:9200\nelast..."}

Data
====
kibana.yml:
----
server.name: kibana
server.host: 0.0.0.0
elasticsearch.hosts: https://elasticsearch:9200
elasticsearch.ssl.verificationMode: full
elasticsearch.ssl.certificateAuthorities: ["/usr/share/kibana/config/ca.pem"]
elasticsearch.username: kibanaserver
elasticsearch.password: kibanaserver
elasticsearch.requestHeadersWhitelist: ["securitytenant","Authorization"]
opendistro_security.multitenancy.enabled: true
opendistro_security.multitenancy.tenants.preferred: ["Private", "Global"]

Events:  <none>

Service

Unlike Elasticsearch, we are not going to create a headless Service, but one that can be reached from outside the cluster. It's important to be aware that I am using vSphere sans NSX-T, so I have no k8s cloud provider. This means that if I want to play with things on the cluster, I either have to run kubectl port-forward, which I don't want to, or use node ports and reverse proxies, which I do. So your mileage will vary a bit on this section; do whatever you think is appropriate.
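
For completeness, if port-forwarding is more your speed, the equivalent once the Kibana Deployment later in this post is running is simply:

kubectl -n kube-logging port-forward svc/kibana 5601:5601

Kibana will then be reachable on localhost:5601.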

Create 05-KibanaService.yml and populate it:

apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: kube-logging
  labels:
    app: kibana
spec:
  ports:
    - port: 5601
      nodePort: 30901
  type: NodePort
  selector:
    app: kibana

This is all pretty straightforward. I'm listening on a given nodePort that I'll reverse proxy. Do what makes you happy!

Apply and inspect:

% kubectl apply -f 05-KibanaService.yml
% kubectl describe svc/kibana --namespace kube-logging
Name:                     kibana
Namespace:                kube-logging
Labels:                   app=kibana
Annotations:              kubectl.kubernetes.io/last-applied-configuration:
                            {"apiVersion":"v1","kind":"Service","metadata":{"annotations":{},"labels":{"app":"kibana"},"name":"kibana","namespace":"kube-logging"},"sp..."}
Selector:                 app=kibana
Type:                     NodePort
IP:                       10.100.200.104
Port:                     <unset>  5601/TCP
TargetPort:               5601/TCP
NodePort:                 <unset>  30901/TCP
Endpoints:                10.200.67.82:5601
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>

Deployment

Here we are actually using a proper Deployment since Kibana doesn't store any local state that we inherently care about.

Create Kibana.yml and populate:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: kube-logging
  labels:
    app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      # todo: readiness/liveness checks.
      containers:
        - name: kibana
          image: amazon/opendistro-for-elasticsearch-kibana:1.0.1
          resources:
            limits:
              cpu: 1000m
            requests:
              cpu: 100m
          ports:
            - containerPort: 5601
          volumeMounts:
            - name: elasticsearch-tls-ca
              mountPath: /usr/share/kibana/config/ca.pem
              subPath: ca.pem
              readOnly: true
            - name: kibana-config
              mountPath: /usr/share/kibana/config/kibana.yml
              subPath: kibana.yml
              readOnly: true
      volumes:
        - name: elasticsearch-tls-ca
          secret:
            secretName: elasticsearch-tls-ca
            items:
              - key: ca.pem
                path: ca.pem
        - name: kibana-config
          configMap:
            name: kibana-config
            items:
              - key: kibana.yml
                path: kibana.yml

I'm not going to go into the minute details of this Deployment spec, as it's really just a web server, and the volumes it mounts are just the ConfigMap and Secret we created earlier. You can re-read the Elasticsearch section on how these work if you really want to. You can, however, see my todo note about creating liveness/readiness probes. It theoretically wouldn't be hard, but on my hardware Kibana takes 4-5 minutes to boot (no joke), so the thresholds would have to be set egregiously high.
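
Given that boot time, it's handier to watch the rollout after the apply below than to repeatedly describe the Deployment:

kubectl -n kube-logging rollout status deployment/kibana
kubectl -n kube-logging logs deployment/kibana -f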

Apply and inspect!

% kubectl apply -f Kibana.yml
% kubectl describe deployment/kibana --namespace kube-logging
Name:                   kibana
Namespace:              kube-logging
CreationTimestamp:      Mon, 15 Jul 2019 00:27:41 -0600
Labels:                 app=kibana
Annotations:            deployment.kubernetes.io/revision: 1
                        kubectl.kubernetes.io/last-applied-configuration:
                          {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"app":"kibana"},"name":"kibana","namespace":"kube-loggi..."}
Selector:               app=kibana
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:  app=kibana
  Containers:
   kibana:
    Image:      amazon/opendistro-for-elasticsearch-kibana:1.0.1
    Port:       5601/TCP
    Host Port:  0/TCP
    Limits:
      cpu:  1
    Requests:
      cpu:        100m
    Environment:  <none>
    Mounts:
      /usr/share/kibana/config/ca.pem from elasticsearch-tls-ca (ro,path="ca.pem")
      /usr/share/kibana/config/kibana.yml from kibana-config (ro,path="kibana.yml")
  Volumes:
   elasticsearch-tls-ca:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elasticsearch-tls-ca
    Optional:    false
   kibana-config:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      kibana-config
    Optional:  false
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
OldReplicaSets:  <none>
NewReplicaSet:   kibana-645fdc5d4 (1/1 replicas created)
Events:          <none>

Now Kibana will be available at whatever external or internal address you've set!

ODfE on the Cloud Foundry Container Runtime!

By default, the Kibana login is admin / admin.

Security Configurations

You'll need to go through and set a default index pattern; you can use the security-auditlog-* pattern for now. The container logs index pattern is logstash-*, but that won't really matter until Fluentd is deployed, so just wait.


Fluent

If we were deploying a standard ELK stack, we'd be using Logstash here. Instead, we are using Fluentd, which is a bit more generic than Logstash. It's also written in Ruby, but it has a much saner configuration syntax and many more plugins. We're going to use the pre-built fluentd-kubernetes-daemonset image from the fluent project, so we're also not going to be creating our own image.

Permissions

In order to read from the system logs and look up pod and namespace metadata, we're going to need three permissions (get, list, watch) on two resources (pods and namespaces). As k8s is a distributed deployment and we'll want to capture all namespaces, we need to ensure it's a cluster-wide permission.

Create 06-FluentdClusterRole.yml and fill it:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  labels:
    app: fluentd
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - namespaces
    verbs:
      - get
      - list
      - watch

In order to limit permissions and ensure auditability, we'll want to use a service account. Let's create 07-FluentdServiceAccount.yml and fill it:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd

Then of course, we need to bind it. Create 08-FluentdClusterRoleBinding.yml and populate:

kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: kube-logging

Set the upstream permissions and verify:

% kubectl apply -f 06-FluentdClusterRole.yml
clusterrole.rbac.authorization.k8s.io/fluentd created

% kubectl describe clusterrole/fluentd --namespace kube-logging
Name:         fluentd
Labels:       app=fluentd
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRole","metadata":{"annotations":{},"labels":{"app":"fluentd"},"name":"fluentd"...}
PolicyRule:
  Resources   Non-Resource URLs  Resource Names  Verbs
  ---------   -----------------  --------------  -----
  namespaces  []                 []              [get list watch]
  pods        []                 []              [get list watch]

% kubectl apply -f 07-FluentdServiceAccount.yml
serviceaccount/fluentd created

% kubectl describe sa/fluentd --namespace kube-logging
Name:                fluentd
Namespace:           kube-logging
Labels:              app=fluentd
Annotations:         kubectl.kubernetes.io/last-applied-configuration:
                       {"apiVersion":"v1","kind":"ServiceAccount","metadata":{"annotations":{},"labels":{"app":"fluentd"},"name":"fluentd","namespace":"kube-logg..."}
Image pull secrets:  <none>
Mountable secrets:   fluentd-token-wckcw
Tokens:              fluentd-token-wckcw
Events:              <none>

% kubectl apply -f 08-FluentdClusterRoleBinding.yml
clusterrolebinding.rbac.authorization.k8s.io/fluentd created

% kubectl describe clusterrolebinding/fluentd --namespace kube-logging
Name:         fluentd
Labels:       <none>
Annotations:  kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"rbac.authorization.k8s.io/v1","kind":"ClusterRoleBinding","metadata":{"annotations":{},"name":"fluentd"},"roleRef":{"apiGro..."}
Role:
  Kind:  ClusterRole
  Name:  fluentd
Subjects:
  Kind            Name     Namespace
  ----            ----     ---------
  ServiceAccount  fluentd  kube-logging
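
As a final sanity check that the role, service account, and binding all line up, you can impersonate the service account; this assumes your own user is allowed to impersonate service accounts, which cluster-admin is:

kubectl auth can-i list pods \
    --as=system:serviceaccount:kube-logging:fluentd \
    --namespace kube-logging

A simple yes is what you're after.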

Deployment

Now we need to go ahead and deploy the Fluentd daemon. Create a file called FluentdDaemonSet.yml and fill it like so:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-logging
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccountName: fluentd
      tolerations:
        - key: node-role.kubernetes.io/master
          effect: NoSchedule
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
          env:
            - name:  FLUENT_ELASTICSEARCH_HOST
              value: "elasticsearch.kube-logging.svc.cluster.local"
            - name:  FLUENT_ELASTICSEARCH_PORT
              value: "9200"
            - name: FLUENT_ELASTICSEARCH_SCHEME
              value: "https"
            - name: FLUENTD_SYSTEMD_CONF
              value: disable
            - name: FLUENT_ELASTICSEARCH_USER
              value: admin
            - name: FLUENT_ELASTICSEARCH_PASSWORD
              value: admin
            - name: FLUENT_ELASTICSEARCH_SSL_VERSION
              value: TLSv1_2
            # some historical nonsense. https://github.com/fluent/fluentd-kubernetes-daemonset#disable-sed-execution-on-elasticsearch-image
            - name: FLUENT_ELASTICSEARCH_SED_DISABLE
              value: "true"
            - name: FLUENT_ELASTICSEARCH_CUSTOM_HEADERS
              value: '{"WWW-Authenticate": "Basic"}'
            - name: SSL_CERT_FILE
              value: /fluentd/etc/ca.pem
          resources:
            limits:
              memory: 512Mi
            requests:
              cpu: 100m
              memory: 200Mi
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/vcap/data/docker/docker/containers
              readOnly: true
            - name: elasticsearch-tls-ca
              mountPath: /fluentd/etc/ca.pem
              subPath: ca.pem
              readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/vcap/data/docker/docker/containers
        - name: elasticsearch-tls-ca
          secret:
            secretName: elasticsearch-tls-ca
            items:
              - key: ca.pem
                path: ca.pem

This is pretty straightforward, but there are some configurations I want to walk through so you know what to pay attention to, as they are easily missed.

env:
    - name:  FLUENT_ELASTICSEARCH_HOST
      value: "elasticsearch.kube-logging.svc.cluster.local"
    - name:  FLUENT_ELASTICSEARCH_PORT
      value: "9200"
    - name: FLUENT_ELASTICSEARCH_SCHEME
      value: "https"
    - name: FLUENTD_SYSTEMD_CONF
      value: disable
    - name: FLUENT_ELASTICSEARCH_USER
      value: admin
    - name: FLUENT_ELASTICSEARCH_PASSWORD
      value: admin
    - name: FLUENT_ELASTICSEARCH_SSL_VERSION
      value: TLSv1_2
    # some historical nonsense. https://github.com/fluent/fluentd-kubernetes-daemonset#disable-sed-execution-on-elasticsearch-image
    - name: FLUENT_ELASTICSEARCH_SED_DISABLE
      value: "true"
    - name: FLUENT_ELASTICSEARCH_CUSTOM_HEADERS
      value: '{"WWW-Authenticate": "Basic"}'
    - name: SSL_CERT_FILE
      value: /fluentd/etc/ca.pem

One thing that is very easy to miss is that while you'd think all the important environment variables start with FLUENT_ELASTICSEARCH_, there's one which is FLUENTD_SYSTEMD_CONF. It's very important that you pay attention to the D, or you will get a lot of these messages:

[warn]: #0 [in_systemd_bootkube] Systemd::JournalError: No such file or directory retrying in 1s
[warn]: #0 [in_systemd_kubelet] Systemd::JournalError: No such file or directory retrying in 1s
[warn]: #0 [in_systemd_docker] Systemd::JournalError: No such file or directory retrying in 1s

You'll also notice we're setting the Elasticsearch HTTP scheme to https, as we need to connect to it via TLS. We're also hardcoding the TLS version to v1.2 (there's not full support for TLS v1.3 yet), and setting the headers to support basic authentication.
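
Once the DaemonSet is running, a quick scan of its logs is the fastest way to confirm you got these environment variables right; you should see Fluentd connect to Elasticsearch without the journal or TLS warnings above:

kubectl -n kube-logging logs -l app=fluentd --tail=20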

volumes:
  - name: varlog
    hostPath:
      path: /var/log
  - name: varlibdockercontainers
    hostPath:
      path: /var/vcap/data/docker/docker/containers
  - name: elasticsearch-tls-ca
    secret:
      secretName: elasticsearch-tls-ca
      items:
        - key: ca.pem
          path: ca.pem

The volume mounts here are really subtle. The really important one is varlibdockercontainers: on a stock Docker host the path would be /var/lib/docker/containers, the local mount path for Docker's native logging subsystem, but because we are using kubo, that mount point is actually /var/vcap/data/docker/docker/containers. It is very important to get this right, because otherwise you won't get any container logs. We're also mounting our /aperture/root-ca certificate authority so we don't get TLS errors when Fluentd connects to Elasticsearch.
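
If you're on a different container runtime or stemcell, it's worth confirming the path actually contains log files once the DaemonSet below is running; the path here is what my kubo nodes use, so adjust to taste:

FLUENTD_POD=$(kubectl -n kube-logging get pods -l app=fluentd -o jsonpath='{.items[0].metadata.name}')
kubectl -n kube-logging exec "$FLUENTD_POD" -- ls /var/vcap/data/docker/docker/containers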

Deploy and validate:

% kubectl apply -f FluentdDaemonSet.yml
daemonset.apps/fluentd unchanged

% kubectl describe ds/fluentd --namespace kube-logging
Name:           fluentd
Selector:       app=fluentd
Node-Selector:  <none>
Labels:         app=fluentd
Annotations:    deprecated.daemonset.template.generation: 1
                kubectl.kubernetes.io/last-applied-configuration:
                  {"apiVersion":"apps/v1","kind":"DaemonSet","metadata":{"annotations":{},"labels":{"app":"fluentd"},"name":"fluentd","namespace":"kube-logg..."}
Desired Number of Nodes Scheduled: 3
Current Number of Nodes Scheduled: 3
Number of Nodes Scheduled with Up-to-date Pods: 3
Number of Nodes Scheduled with Available Pods: 3
Number of Nodes Misscheduled: 0
Pods Status:  3 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app=fluentd
  Service Account:  fluentd
  Containers:
   fluentd:
    Image:      fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
    Port:       <none>
    Host Port:  <none>
    Limits:
      memory:  512Mi
    Requests:
      cpu:     100m
      memory:  200Mi
    Environment:
      FLUENT_ELASTICSEARCH_HOST:            elasticsearch.kube-logging.svc.cluster.local
      FLUENT_ELASTICSEARCH_PORT:            9200
      FLUENT_ELASTICSEARCH_SCHEME:          https
      FLUENTD_SYSTEMD_CONF:                 disable
      FLUENT_ELASTICSEARCH_USER:            admin
      FLUENT_ELASTICSEARCH_PASSWORD:        admin
      FLUENT_ELASTICSEARCH_SSL_VERSION:     TLSv1_2
      FLUENT_ELASTICSEARCH_SED_DISABLE:     true
      FLUENT_ELASTICSEARCH_CUSTOM_HEADERS:  {"WWW-Authenticate": "Basic"}
      SSL_CERT_FILE:                        /fluentd/etc/ca.pem
    Mounts:
      /fluentd/etc/ca.pem from elasticsearch-tls-ca (ro,path="ca.pem")
      /var/log from varlog (rw)
      /var/vcap/data/docker/docker/containers from varlibdockercontainers (ro)
  Volumes:
   varlog:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log
    HostPathType:
   varlibdockercontainers:
    Type:          HostPath (bare host directory volume)
    Path:          /var/vcap/data/docker/docker/containers
    HostPathType:
   elasticsearch-tls-ca:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elasticsearch-tls-ca
    Optional:    false
Events:          <none>

Wait a minute for things to populate, and then you'll have container logs! In order to view them, you need to go through and create an index pattern. New indices will pop up named logstash-{date}, so just use logstash-*.

Logstash Index Pattern

Give it a short while to populate (assuming you have things already running you won't need to wait). Now you have logs!

Example of logs!

Good luck, and happy logging!

Appendix

If you're wondering why I had you create the files with numerical prefixes, it's because once you write them to disk, it's likely you'll want to deploy them to more than one place. Once they are in the same folder, they'll look like this:

% ls -1
01-LoggingNamespace.yml
02-ElasticsearchConfigMap.yml
03-ElasticsearchService.yml
04-KibanaConfigMap.yml
05-KibanaService.yml
06-FluentdClusterRole.yml
07-FluentdServiceAccount.yml
08-FluentdClusterRoleBinding.yml
ElasticsearchStatefulSet.yml
FluentdDaemonSet.yml
Kibana.yml

The reason this is useful is that when you upload this to git (after cleaning out the Fluentd Elasticsearch credentials), you'll definitely want to just run kubectl apply -f . against the whole directory. Because kubectl reads the directory alphabetically, the numeric prefixes ensure the k8s dependency tree is populated in the proper order.
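
In other words, once everything is in one directory, standing the whole stack up again is just:

kubectl apply -f .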

Since I promised I would share my kubo manifest, you can find it here.

I'm definitely human, and there may be varying mistakes here and there, so please let me know if this doesn't work for you!