Category: DevOps

Storing Secrets for Teams

Storing secrets safely across a team matters. There is a great blog post on managing secrets; these are the things it says you shouldn’t do:

  • Using the same password for all the things!
  • Using a shared Excel file
  • Emailing passwords around
  • Using Chat
  • Using Git repos
  • Custom secrets management built in house

You can also view the options the author gives for storing secrets.

Measuring the Stored Secret Tools

An important part of choosing a tool for your team is actually using it: you get a feel for the experience and whether it feels robust. I also think there should be a predefined list of objective measures or requirements for the tool. The tools that meet the requirements and also feel good to use will guide our choice.

In our case we have the following requirements and measures:

  • Support LDAP
  • Self-hosted, open source and not too expensive
  • Level of good security practice
  • Audit trail
  • Role organisation and management
  • Types of secrets it can store
  • User Experience
  • API existence, level of API documentation and integration
  • Documentation of system
  • Nice to haves: CLI tool, one time password link
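One way to keep the requirements above objective is a small weighted checklist. The sketch below is only an illustration of that idea: the weights, the must-have set, the tool names and the scores are all hypothetical placeholders, not results from this review.

```python
from dataclasses import dataclass

# Hypothetical weights: how much each requirement matters (higher = more important).
WEIGHTS = {
    "ldap": 3,
    "self_hosted_open_source": 3,
    "audit_trail": 2,
    "role_management": 2,
    "api": 1,
    "documentation": 1,
}

# Requirements a tool must meet to be considered at all (also an assumption).
MUST_HAVE = {"ldap", "self_hosted_open_source"}


@dataclass
class Tool:
    name: str
    scores: dict  # requirement -> score from 0 (missing) to 2 (good)


def total(tool: Tool) -> int:
    """Weighted sum of the scores; requirements without a score count as 0."""
    return sum(WEIGHTS[req] * tool.scores.get(req, 0) for req in WEIGHTS)


def shortlist(tools: list) -> list:
    """Drop tools missing a must-have, then rank the rest by weighted score."""
    eligible = [t for t in tools if all(t.scores.get(req, 0) > 0 for req in MUST_HAVE)]
    return sorted(eligible, key=total, reverse=True)


# Placeholder tools and scores, purely for illustration.
tools = [
    Tool("tool-a", {"ldap": 2, "self_hosted_open_source": 2, "audit_trail": 1, "api": 1}),
    Tool("tool-b", {"ldap": 0, "self_hosted_open_source": 2, "audit_trail": 2, "api": 2}),
    Tool("tool-c", {"ldap": 1, "self_hosted_open_source": 2, "role_management": 2, "documentation": 1}),
]

for tool in shortlist(tools):
    print(tool.name, total(tool))
```

The score only narrows the field; the "does it feel good to use" part still has to come from actually trying each tool.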

The Tools

Tools I found for stored secrets but with much more (cmdb):

Tools I found specifically for stored secrets for teams:

Addons to tools I found:

LDAP/SSH tools:




  • Support LDAP: Yes
  • Self-hosted, open source and not too expensive:
  • Level of good security practice
  • Audit trail
  • Role organisation and management
  • Types of secrets it can store
  • User Experience
  • API existence, level of API documentation and integration
  • Documentation of system
  • Nice to haves:
    • dynamic secrets – time-limited access to other systems
    • multiple authentication methods – can have multiple LDAP auth configs


A system for managing and distributing secrets. Built with Java.

I tried both the Docker and Maven installs without success, just errors:

[ERROR] Failed to execute goal org.codehaus.mojo:sql-maven-plugin:1.5:execute (default) on project keywhiz-model: Access denied for user 'root'@'localhost' (using password: NO) -> [Help 1]




Systems Password Manager. Built in PHP.

I used Docker. During setup it is important to set server/host to: syspass-db

With the Docker setup I enabled LDAP, and after authenticating it goes straight to http://localhost:32770/undefined with a 404. Something to do with this open issue.

I’m also not a fan of material design which is already old and has never felt clean to me.


Keycloak is not a password manager. It is open source identity and access management – allowing for single sign on.

Features include:

  • User Registration
  • Social Login
  • Single sign on for all applications in the same realm
  • 2 factor auth
  • LDAP integration
  • Kerberos broker
  • multitenancy and per realm customisable skin

I used the Docker container for this application and everything just worked and it was clean.

It feels like it is the gold standard in IAM and single sign on.

I haven’t used IAM or SSO yet, but this is a high quality product/project just based on the fact that the Docker container just worked.


Open source password management solutions

To use it for organisations/teams you need the enterprise edition. I didn’t try this one.


Password Manager. Psono client is JS, server is python.

  • Support LDAP: No, unless you pay.
  • Self-hosted, open source and not too expensive: Self hosted. Opensource, start paying $2 per user per month after 10 users.
  • Level of good security practice: ?
  • Audit trail: No, unless you pay.
  • Role organisation and management: Decent
  • Types of secrets it can store: Password, Note, GPG Key, Bookmark, File (Couldn’t
  • User Experience: Looks good, not too complicated
  • API existence, level of API documentation and integration: ?
  • Documentation of system: Has documentation, but not Sphinx docs. Its own shitty docs.
  • Nice to haves:
    • Callbacks – hits a url when a password is changed

Overall it is missing the mark a bit…


A PHP based collaborative passwords manager

  • Support LDAP: Yes
  • Self-hosted, open source and not too expensive: Yes, open source. No enterprise version.
  • Level of good security practice: ?
  • Audit trail: Decent – Can view connections, passwords access and by who
  • Role organisation and management: Fine grained
  • Types of secrets it can store: Restricted to passwords
  • User Experience: Decent, a bit clunky – it is php
  • API existence, level of API documentation and integration: Yes, but not great
  • Documentation of system: Good, sphinx readthedocs
  • Nice to haves:
    • Password expiration
    • Import/Export
    • One time view password link
    • Web browser extension


Couldn’t get the docker install to work…

Lyft Confidant

Your secret keeper. Written in python and JS.

  • Support LDAP: No. But has KMS, google and SAML – whatever that means.
  • Self-hosted, open source and not too expensive: Self hosted, open source. No enterprise edition.
  • Level of good security practice: ?
  • Audit trail: Yes, looks pretty basic though
  • Role organisation and management: ?
  • Types of secrets it can store: Seems you can store anything
  • User Experience: Ok – not very intuitive or guiding
  • API existence, level of API documentation and integration: API exists, but it is badly documented
  • Documentation of system: Home made docs, basics skipped.
  • Nice to haves: CLI tool, one time password link
    • Works closely with AWS IAM
    • Blind credentials



This is a python command line utility that can get ssh keys from your ldap server and add them to your server’s authorized keys.

Documentation on ssh-ldap-pubkey

This tool allowed me to view the SSH public keys for people on ldap.

It was however not automatic…I had to install the package globally. Then I had to manually configure /etc/ldap.conf with the ldap server details.

On a fresh ubuntu you might also have to install:

sudo apt-get install libsasl2-dev python-dev libldap-dev libssl-dev

Another annoying thing is that I had to install a specific version, 1.3.0, as 1.3.1 was broken.
Then, to configure the OpenSSH server to fetch users’ authorized keys, I had to add 2 lines to /etc/ssh/sshd_config:

AuthorizedKeysCommand /usr/bin/ssh-ldap-pubkey-wrapper
AuthorizedKeysCommandUser nobody

I still didn’t manage to get it to work.

What happens in these cases:

  1. A new user tries to ssh to your server with their ldap username?
  2. A user whose ssh key has been added to your server is removed or taken out of the group on ldap?
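To reason about those two cases it helps to remember what AuthorizedKeysCommand does: sshd runs the command on each login attempt, passes the login name, and accepts whatever public keys the command prints. The sketch below mimics that lookup with an in-memory dict standing in for the LDAP server. The users, the group name and the key strings are all made up, and this is not how ssh-ldap-pubkey itself is implemented.

```python
import sys

# Stand-in for the LDAP directory: username -> groups and public keys.
# All entries are made up for illustration.
DIRECTORY = {
    "alice": {"groups": {"ops"}, "keys": ["ssh-ed25519 AAAA...alice alice@laptop"]},
    "bob": {"groups": {"dev"}, "keys": ["ssh-ed25519 AAAA...bob bob@laptop"]},
}

ALLOWED_GROUP = "ops"  # hypothetical group that may log in to this server


def authorized_keys(username: str, directory: dict) -> list:
    """Return the keys sshd should accept for this user at this moment."""
    entry = directory.get(username)
    if entry is None or ALLOWED_GROUP not in entry["groups"]:
        return []  # unknown user or not in the group: no keys offered
    return list(entry["keys"])


if __name__ == "__main__":
    # By default sshd passes the login name as the first argument.
    user = sys.argv[1] if len(sys.argv) > 1 else ""
    for key in authorized_keys(user, DIRECTORY):
        print(key)
```

Because the lookup happens at login time, a user removed from the directory or the group stops getting keys from the command on the next attempt. Two caveats as far as I understand it: sshd still falls back to any keys in the user's local authorized_keys file, and a brand new LDAP user also needs an account the server can resolve before the key lookup matters.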


A video you can watch on the topic:



Introduction to Alerta: Open Source Aggregated Alerts

There are a number of platforms available these days to help operations teams deal with alerts, namely PagerDuty, VictorOps and OpsGenie. Unfortunately these are paid tools.

These tools are known as monitoring aggregators.

I was looking through the integrations of elastalert and found that there is one for Alerta, so I checked the website and it seemed to tick all the boxes of monitoring aggregation.

I used the docker compose way of setting it up quickly, but if you want to set it up properly then follow the deployment guide.

Update some config:

docker exec -u root -it alerta_web_1 /bin/bash
apt update
apt install vim
# Edit the config in /app/alertad.conf
# Restart the container

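Add the housekeeping cron job. The line is in crontab format, so it belongs in /etc/cron.d (files in /etc/cron.daily are run as scripts):

echo "* * * * * root /venv/bin/alerta housekeeping" >/etc/cron.d/alerta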

The default timeout period for an alert is 86400 seconds, or one day.
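If one day is too long for a particular alert, the timeout can be sent along with the alert itself. Below is a minimal sketch of building such an alert and posting it. The URL, the API key, the service name and the example values are assumptions for a local docker-compose setup, and the field names are how I remember them from the Alerta API docs, so check them against your version.

```python
import json
import urllib.request

# Assumed values for a local docker-compose setup; adjust to your deployment.
ALERTA_URL = "http://localhost:8080/api/alert"
API_KEY = "replace-with-your-key"


def build_alert(resource: str, event: str, severity: str, timeout: int = 86400) -> dict:
    """Build an alert body; timeout is in seconds (86400 = one day)."""
    return {
        "resource": resource,
        "event": event,
        "environment": "Production",
        "severity": severity,
        "service": ["example-service"],  # hypothetical service name
        "timeout": timeout,
    }


def send_alert(alert: dict) -> int:
    """POST the alert to Alerta and return the HTTP status code."""
    request = urllib.request.Request(
        ALERTA_URL,
        data=json.dumps(alert).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Key {API_KEY}"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Only build and print the alert here; call send_alert(alert) against a running instance.
alert = build_alert("web01", "HighMemory", "major", timeout=3600)
print(json.dumps(alert, indent=2))
```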

Check out the alerta plugins

What popular alerting and monitoring tools does Alerta integrate with?

Reducing and Learning from Monitoring Alerts in Business Environments

How often do monitoring alerts and notifications get out of hand in an organisation? There are too many alerts, they arrive via a single channel, minor and major severities are alerted in the same manner, and engineers who constantly have to check them lose time they could spend on improving things and fixing the underlying errors.

Ideally we want alerts to be relevant for things that need to be fixed in a short time frame. Other alerts (non-critical) still may be good but should be reviewed looking back over a longer period. That is how I see it at least…

The key things to get right in my opinion:

  • Relevancy: disregard irrelevant alerts for the present
  • Channels: Sending critical alerts to instant messaging / phone calls and non-critical to email / analytics platform
  • Structured data – alerts should be as structured as possible so that you can build specific criteria and rules on them. If the data you receive is garbage text (like an email) then you won’t have a good way of classifying alerts and remediating them.
  • Let the various departments own their rules / criteria – the people running these systems are the ones that should receive the alerts and manage the channel, severity etc.
  • Machine learning?
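The channel and ownership points above are easier to picture with structured alerts in hand. Here is a minimal sketch, assuming every alert carries an owner and a severity. The departments, channels and rules are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    source: str    # system that raised the alert
    owner: str     # department that owns the system
    severity: str  # "critical", "major", "minor", ...
    message: str


# Hypothetical per-department rules: severity -> channel.
# Each department would maintain its own entry.
RULES = {
    "database": {"critical": "phone", "major": "chat"},
    "web": {"critical": "chat"},
}

DEFAULT_CHANNEL = "email"  # everything else gets reviewed later


def route(alert: Alert) -> str:
    """Pick a channel using the owning department's rules."""
    return RULES.get(alert.owner, {}).get(alert.severity, DEFAULT_CHANNEL)


alerts = [
    Alert("db01", "database", "critical", "replication stopped"),
    Alert("db01", "database", "minor", "slow query"),
    Alert("web03", "web", "major", "5xx rate above baseline"),
]

for alert in alerts:
    print(alert.source, alert.severity, "->", route(alert))
```

None of this works if the alert arrives as free text, which is why the structured data point comes first.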

A Note on Machine Learning

Naturally we want this all done for us at the click of a button, but it is not that easy. Some people will just shout machine learning or AI will handle it, without the slightest idea of what that entails.

Leveraging machine learning (specifically supervised) I think is the way to go. This way you train the machine to identify critical / relevant messages – with a human. Much like how Google uses Captcha to train robots to identify bus stops and shop entrances or read books. Making structured data from unstructured….

I thought having a user assign a severity level, 1 – 5 based on each alert from each relevant department for a while will help a machine learning algorithm identify important and not important messages.
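To make the idea concrete, here is a toy sketch of the supervised approach: it learns the average human-assigned severity per word and scores new alerts from that. It is a deliberately naive baseline in plain Python with made-up alerts and labels; a real attempt would use a proper machine learning library and far more data.

```python
from collections import defaultdict


def train(labelled):
    """Learn the average human-assigned severity for each word."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for text, severity in labelled:
        for word in set(text.lower().split()):
            totals[word] += severity
            counts[word] += 1
    return {word: totals[word] / counts[word] for word in totals}


def predict(model, text, default=3.0):
    """Average the learned severities of the known words in a new alert."""
    known = [model[w] for w in set(text.lower().split()) if w in model]
    return sum(known) / len(known) if known else default


# Made-up alerts with severities (1 = ignore, 5 = act now) assigned by a human.
labelled = [
    ("disk full on db01", 5),
    ("disk usage warning on web02", 2),
    ("backup completed on db01", 1),
    ("replication stopped on db01", 5),
]

model = train(labelled)
print(round(predict(model, "disk full on web02"), 2))
```

The important part is the labelling step: the model can only be as good as the severities people assign while they triage real alerts.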

I did a bit of research but there is no existing solution yet, and it will probably need some more time or a custom solution…

What Can you do now to control your alerts?

You need to keep all your alerts first, store them so that you can run analytics on them later for more insight.

Store all the things…


Picture taken from:

You also want the ability to add rules / criteria easily to the alerts coming in, and you want this to be easy enough for non-developers to create and manage them.

If you look at the image above you want to collect all the data you can (so prefer to get it directly from the system, instead of letting the monitoring system control the alerts).

So use elasticsearch…bottomline. That is the ELK stack (you could also try using the TICK-L stack). We just need to figure out what we want to interact with it in terms of creating rules, criteria and possible machine learning for it.


The Proof however is in the Tasting

Let’s try out the various options. First ensure you have an ELK stack instance; you can check this DigitalOcean tutorial. Ensure you are getting data by trying one of the various beats to monitor your system.

I set it up and now I have data:

As you can see, Metricbeat probably isn’t as robust and reliable as something like New Relic.

I then set up elastalert…


Elastalert supports only Python 2.7, which as we know went out of support in 2020.

It is a bit tricky to set up as well; not super tricky, but tricky. Creating rules is also not trivial: you need to know the different rule types and the parameters they accept, and the rules need to be tested. All these parameters are configured in YAML, which developers seem to think non-technical people or even relatively technical people can use. The truth is YAML is tough, and an HTML form with dropdowns and validation is usually better.

You pretty much have to look at the examples to try and create a rule, also querying the elasticsearch index is important.

I created a test rule after 30 minutes: rules/elasticsearch_memory_high.yaml

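name: Metricbeat Elasticsearch Memory High Rule
type: metric_aggregation

es_host: localhost
es_port: 9200

index: metricbeat-*

# How far back each query looks
buffer_time:
  hours: 1

metric_agg_key: system.memory.used.pct
metric_agg_type: avg
query_key: beat.hostname
doc_type: doc

# Size of the window each average is calculated over
bucket_interval:
  minutes: 5

sync_bucket_interval: true
allow_buffer_time_overlap: true
use_run_every_query_size: true

# Alert when the average falls below 0.1 or rises above 0.9
min_threshold: 0.1
max_threshold: 0.9

filter:
# Assumed field name for the Metricbeat memory metricset
- term:
    metricset.name: memory

alert:
- "debug"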

Then I test the rule with:

elastalert-test-rule rules/elasticsearch_memory_high.yaml

We can see the matches in the stdout:

INFO:elastalert:Alert for Metricbeat Elasticsearch Memory High Rule, at 2019-05-28T06:40:00Z:
INFO:elastalert:Metricbeat Elasticsearch Memory High Rule

Threshold violation, avg:system.memory.used.pct 0.942444444444 (min: 0.1 max : 0.9) 

@timestamp: 2019-05-28T06:40:00Z
metric_system.memory.used.pct_avg: 0.942444444444
num_hits: 1296
num_matches: 39

INFO:elastalert:Ignoring match for silenced rule Metricbeat Elasticsearch Memory High
INFO:elastalert:Ignoring match for silenced rule Metricbeat Elasticsearch Memory High
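To see what the rule is checking, here is a small sketch of the same logic in plain Python: group samples into 5 minute buckets per host, average them, and flag averages outside the thresholds. It only mirrors the idea behind a metric_aggregation rule, not elastalert's actual code, and the sample data is made up.

```python
from collections import defaultdict
from datetime import datetime

MIN_THRESHOLD = 0.1   # same values as the rule
MAX_THRESHOLD = 0.9
BUCKET_MINUTES = 5


def bucket_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 5 minute bucket."""
    return ts.replace(minute=ts.minute - ts.minute % BUCKET_MINUTES, second=0, microsecond=0)


def violations(samples):
    """Average samples per (host, bucket) and return averages outside the thresholds."""
    buckets = defaultdict(list)
    for host, ts, used_pct in samples:
        buckets[(host, bucket_start(ts))].append(used_pct)
    result = []
    for (host, start), values in sorted(buckets.items()):
        avg = sum(values) / len(values)
        if avg < MIN_THRESHOLD or avg > MAX_THRESHOLD:
            result.append((host, start, avg))
    return result


# Made-up samples: (hostname, timestamp, system.memory.used.pct).
samples = [
    ("host-a", datetime(2019, 5, 28, 6, 40, 10), 0.95),
    ("host-a", datetime(2019, 5, 28, 6, 42, 10), 0.93),
    ("host-a", datetime(2019, 5, 28, 6, 46, 10), 0.50),
    ("host-b", datetime(2019, 5, 28, 6, 41, 0), 0.40),
]

for host, start, avg in violations(samples):
    print(f"{host} {start:%H:%M} avg={avg:.2f}")
```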


And bang! I got it working with telegram.

Just updated my config and ran it with:

python -m elastalert.elastalert --config config.yaml --verbose --rule rules/elasticsearch_memory_high.yaml


The only problem was that it was sending this alert every minute.

From a Stack Overflow question it seemed the answer was the realert option. We don’t want the alerts to be spam (avoiding that is why we did all this).

It is very important to understand the following terms:

bucket_interval, buffer_time, use_run_every_query_size and realert.
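Of these, realert is the one that stops the spam. The sketch below shows the idea as I understand it: after an alert goes out for a key, further matches for that key are ignored until a quiet period has passed. It is an illustration only, not elastalert's implementation, and the 30 minute period is an arbitrary choice.

```python
from datetime import datetime, timedelta


class Realert:
    """Suppress repeat alerts for the same key inside a quiet period."""

    def __init__(self, quiet_period: timedelta):
        self.quiet_period = quiet_period
        self.silenced_until = {}

    def should_send(self, key: str, now: datetime) -> bool:
        until = self.silenced_until.get(key)
        if until is not None and now < until:
            return False  # still silenced, ignore this match
        self.silenced_until[key] = now + self.quiet_period
        return True


realert = Realert(timedelta(minutes=30))  # hypothetical quiet period
start = datetime(2019, 5, 28, 6, 40)

# The rule matches every minute for an hour, but only a few matches become notifications.
sent = [m for m in range(0, 61) if realert.should_send("host-a", start + timedelta(minutes=m))]
print(sent)
```

Keying the silence on the query_key value (the hostname here) means one noisy host does not hide alerts from another.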

The next thing you need is to run it as a service via systemd or supervisord, but I will skip this part.

I want to try the other options.

Elastalert Kibana Plugin

After setting up elastalert I realised that creating rules via yaml for non-technical people that struggle to read and apply docs will be impossible. For me it took about an hour plus debugging to figure out a single rule.

So we need a frontend that makes it easy for people to figure out and set rules for systems that they manage.

For that purpose and when still using elastalert we can use the 2 frontends available – Elastalert Kibana Plugin and Praeco. They are both in active development but Praeco is in a pre release phase.

To make use of these frontends, you need an api which apparently vanilla Elastalert from Yelp does not have. So to use these frontends we need to use the Bitsensor Elastalert fork.

Bitsensor elastalert is set up with Docker according to their documentation.

I’m no docker expert but managed to sort it out, using the following steps.

Install Docker

The instructions on the docker docs site are good.

Install Bitsensor Elastalert API

The instructions on the bitsensor elastalert site did not work perfectly for me. This is what I did:

# Ran the recommended way
docker run -d -p 3030:3030 -p 3333:3333 \
    -v `pwd`/config/elastalert.yaml:/opt/elastalert/config.yaml \
    -v `pwd`/config/elastalert-test.yaml:/opt/elastalert/config-test.yaml \
    -v `pwd`/config/config.json:/opt/elastalert-server/config/config.json \
    -v `pwd`/rules:/opt/elastalert/rules \
    -v `pwd`/rule_templates:/opt/elastalert/rule_templates \
    --net="host" \
    --name elastalert bitsensor/elastalert:latest

# That created the container but it exited prematurely, this did the same thing
docker run -d -p 3030:3030 -p 3333:3333 bitsensor/elastalert:latest

# I noticed it was exiting, so I checked the logs and saw that it could not access elasticsearch running on the host (not in the container)
# I needed the container to access the host's network, so I tried the command below
# Note: docker options only take effect before the image name; anything after it is passed to the container as arguments

docker run -d -p 3030:3030 -p 3333:3333 bitsensor/elastalert:latest --network host

It still could not connect to the host, which I assume means that the address I used points to the container and not to the host. Debugging this is difficult though, and I don’t want this to be a docker post. The solution looks to be connecting to the host using host.docker.internal, but there is a caveat: this only works on Mac and Windows, so not on Linux or in production.
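A quick way to debug this kind of thing is to test, from inside the container, which address actually reaches elasticsearch. A minimal sketch using only the standard library follows. The candidate list is an assumption: 172.17.0.1 is commonly the default docker bridge gateway on Linux, but check your own setup.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and names that do not resolve.
        return False


# Candidate addresses to try from inside the container (assumptions, adjust as needed).
CANDIDATES = ["localhost", "host.docker.internal", "172.17.0.1"]


def report(candidates, port=9200):
    """Print which candidate addresses accept a connection on the given port."""
    for host in candidates:
        print(host, can_connect(host, port))
```

Calling report(CANDIDATES) inside the container shows which address to put in the elastalert config. A successful TCP connection only proves the port is reachable; elasticsearch must also be bound to an interface the container can see.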

Ah, I messed it up. Just run the command they give and you will get a relevant error. I have elasticsearch version 6.8.0, so the latest elastalert will not work as it uses the elasticsearch Python package for 7.0.0.

This was the error and this is the issue on github:

09:15:04.176Z ERROR elastalert-server:
    ProcessController:      return func(*args, params=params, **kwargs)
    TypeError: search() got an unexpected keyword argument 'doc_type'
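This kind of mismatch can be caught before anything runs, since the elasticsearch Python client is meant to match the server's major version. A small sketch of that check is below. The JSON is a trimmed example of what the elasticsearch root endpoint returns, and the function names are my own.

```python
import json


def major(version: str) -> int:
    """Return the major component of a version string such as '6.8.0'."""
    return int(version.split(".")[0])


def client_matches_server(server_info: str, client_version: str) -> bool:
    """Compare the server's major version with the Python client's."""
    server_version = json.loads(server_info)["version"]["number"]
    return major(server_version) == major(client_version)


# Trimmed example of the root endpoint response for my server.
server_info = '{"version": {"number": "6.8.0"}}'

print(client_matches_server(server_info, "7.0.0"))
print(client_matches_server(server_info, "6.3.1"))
```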

To fix that you need to build the image yourself with:

make build v=v0.1.39

But that fails with:

step 24/29 : COPY rule_templates/ /opt/elastalert/rule_templates
failed to export image: failed to create image: failed to get layer sha256:66d9b1e58ace9286d78c56116c50f7195e40bfe4603ca82d473543c7fc9b901a: layer does not exist

This was fixed by running the build again. Alas, another issue is that the Yelp requirements file for that version did not lock the elasticsearch version, so I had to hack it with this:

RUN sed -i 's/jira>=1.0.10/jira>=1.0.10,<1.0.15/g' setup.py && \
    python setup.py install && \
    pip install elasticsearch==6.3.1 && \
    pip install -r requirements.txt

Boom, so first step done. Next step is getting the elastalert kibana frontend plugin working:

To install it you go to: cd /usr/share/kibana/.

and then:

sudo ./bin/kibana-plugin install

Only problem was that the elasticsearch version I was using 6.8.0 was not supported, so I am going to try praeco.


This damn thing doesn’t use plain docker; it uses docker-compose, a different thing: an orchestrator for docker containers. It can be installed and used by following these docker-compose install docs.

Pull the repo then do:

docker-compose up

The thing about Praeco is that it includes both the bitsensor elastalert API and Praeco…damn, I wasted so much time setting it up manually. Also it runs on port 8080, so try not to have that port already in use on the host.

I wasn’t able to fix the issue of the docker containers not being able to connect to the elasticsearch on the local host; read the troubleshooting guide for more info. I did get it working with a remote elasticsearch.

Wow this thing is amazing…

Praeco is awesome and interactive and can be very powerful, it is still in development and has one or two bugs but overall excellent.


The rules are limited compared to the yaml based elastalert, but other than that it is an excellent and useful frontend.


Sentinl is more native to kibana, in that it plugs right in much like the existing xpack plugin.

Check your elasticsearch (or kibana) version:

http :9200


sudo /usr/share/kibana/bin/kibana --version

Again the issue of compatibility rears its ugly head, where my version 6.8.0 does not have a respective release for sentinl. Only version 6.6.0 is there. Perhaps a tactic by elasticsearch?

Damn it also looks like sentinl will only be looking to support siren going forward:

Dear all, with the launch of Open Distro we feel the needs of the Kibana community are sufficiently served and as such we are focusing Sentinl on the needs of the Siren platform only. Siren is an investigative intelligence platform

Even attempting to install it fails:

Attempting to transfer from
Transferring 28084765 bytes....................
Transfer complete
Retrieving metadata from plugin archive
Extracting plugin archive
Extraction complete
Plugin installation was unsuccessful due to error "Plugin sentinl [6.6.1] is incompatible with Kibana [6.8.0]"

So that is the end of that…


Ah the PHP and apache crew…looks ok but not something I want to look at right now. Elastalert is my guy.