Netflix Atlas for monitoring operational intelligence
Atlas was developed by Netflix to manage dimensional time series data for near real-time operational insight. Atlas features in-memory data storage, allowing it to gather and report very large numbers of metrics very quickly.
Atlas captures operational intelligence. Whereas business intelligence is data gathered for analyzing trends over time, operational intelligence provides a picture of what is currently happening within a system.
The main goals for Atlas were to build a system that provided:
- A Common API – To allow flexibility in backend implementations and to provide merged views across backends, there is a need for a query layer that can be hierarchically composed.
- Scale – To handle a large quantity of data (close to 2 million metrics) and to scale with the hardware available to analyze and store it.
- Dimensionality – To support complex regular expressions for slicing and dicing the data based on its dimensions.
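To make the dimensionality goal more concrete, here is a minimal Python sketch of selecting time series by a regular expression over one tag and then grouping by another. It is only an illustration of the idea, not Atlas code: the sample data, the tag values, and the helpers `select` and `group_sum` are all hypothetical.

```python
import re

# Hypothetical sample data: each series is identified by a set of tags
# (dimensions) rather than by a single flat metric name.
SERIES = [
    {"tags": {"name": "cpu", "nf.cluster": "discovery", "nf.zone": "us-east-1a"}, "value": 40.0},
    {"tags": {"name": "cpu", "nf.cluster": "discovery", "nf.zone": "us-east-1b"}, "value": 60.0},
    {"tags": {"name": "cpu", "nf.cluster": "api", "nf.zone": "us-east-1a"}, "value": 10.0},
]


def select(series, key, pattern):
    """Return the series whose tag `key` fully matches the regular expression."""
    regex = re.compile(pattern)
    return [s for s in series if key in s["tags"] and regex.fullmatch(s["tags"][key])]


def group_sum(series, key):
    """Sum the values of the given series, grouped by the tag `key`."""
    totals = {}
    for s in series:
        group = s["tags"].get(key)
        totals[group] = totals.get(group, 0.0) + s["value"]
    return totals


# Slice: keep only clusters matching "disc.*"; dice: break the result down by zone.
matched = select(SERIES, "nf.cluster", "disc.*")
by_zone = group_sum(matched, "nf.zone")
```

Because every series carries its dimensions as tags, the same data can be sliced along any tag without defining a new metric name for each combination.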
Atlas also supports a stack language for expressing complex data queries in a URL-friendly format. It is loosely based on the RPN expressions supported by Tobias Oetiker's rrdtool. The following is an example of a stack language expression:
This example pushes two strings, nf.cluster and discovery, onto the stack and then executes the command :eq. The :eq command pops the two strings from the stack and pushes a query object onto the stack. This behavior can be described by the stack effect String:key String:value – Query. We then push a list of tag keys onto the stack and execute the command :by to group the results.
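The evaluation steps described here can be sketched as a toy interpreter in Python. This is a simplified illustration, not the Atlas implementation: the `Query` and `GroupBy` classes and the `evaluate` function are hypothetical, only the `:eq` and `:by` commands are modeled, and the tag key `nf.zone` in the sample expression is an assumption, since the text does not say which tag keys are pushed.

```python
from dataclasses import dataclass

_LIST_START = object()  # sentinel marking where a "(" opened a list


@dataclass(frozen=True)
class Query:
    """Hypothetical stand-in for the query object produced by :eq."""
    key: str
    value: str


@dataclass(frozen=True)
class GroupBy:
    """Hypothetical stand-in for the grouped expression produced by :by."""
    query: Query
    keys: tuple


def evaluate(expr):
    """Evaluate a comma-separated stack expression and return the final stack."""
    stack = []
    for token in expr.split(","):
        if token == "(":
            stack.append(_LIST_START)
        elif token == ")":
            # Pop items back to the matching "(" and push them as one list.
            items = []
            while stack and stack[-1] is not _LIST_START:
                items.append(stack.pop())
            if not stack:
                raise ValueError("unbalanced ')' in expression")
            stack.pop()  # discard the sentinel
            stack.append(list(reversed(items)))
        elif token == ":eq":
            # Stack effect: String:key String:value -- Query
            value = stack.pop()
            key = stack.pop()
            stack.append(Query(key, value))
        elif token == ":by":
            # Pops a list of tag keys and a query, pushes the grouped result.
            keys = stack.pop()
            query = stack.pop()
            stack.append(GroupBy(query, tuple(keys)))
        elif token.startswith(":"):
            raise ValueError(f"unsupported command: {token}")
        else:
            stack.append(token)  # plain strings are pushed as-is
    return stack


# Assumed sample expression; nf.zone is an illustrative tag key.
result = evaluate("nf.cluster,discovery,:eq,(,nf.zone,),:by")
```

After evaluation, the stack holds a single `GroupBy` object wrapping the `Query` for nf.cluster equal to discovery. Since tokens are separated by commas and commands begin with a colon, the whole expression can be placed in a URL query parameter with little escaping.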
More on the stack language: https://github.com/Netflix/atlas/wiki/Stack-Language
Quick start: https://github.com/Netflix/atlas/wiki/Getting-Started