Best Tools/Open Source Libs

Oryx2 project realization of lambda architecture

Oryx 2 is a realization of the lambda architecture built on Apache Spark and Apache Kafka, but with specialization for real-time large scale machine learning. It is a framework for building applications, but also includes packaged, end-to-end applications for collaborative filtering, classification, regression and clustering.

Lambda Architecture is a useful framework to think about designing big data applications. Nathan Marz designed this generic architecture addressing common requirements for big data based on his experience working on distributed data processing systems at Twitter.

Some of the key requirements in building this architecture include: Fault-tolerance against hardware failures and human errors Support for a variety of use cases that include low latency querying as well as updates Linear scale-out capabilities, meaning that throwing more machines at the problem should help with getting the job done Extensibility so that the system is manageable and can accommodate newer features easily.

Image – Lambda architecture
  1. All data entering the system is dispatched to both the batch layer and the speed layer for processing.
  2. The batch layer has two functions: (i) managing the master dataset (an immutable, append-only set of raw data), and (ii) to pre-compute the batch views.
  3. The serving layer indexes the batch views so that they can be queried in low-latency, ad-hoc way.
  4. The speed layer compensates for the high latency of updates to the serving layer and deals with recent data only.
  5. Any incoming query can be answered by merging results from batch views and real-time views.

Developers can consume Oryx 2 as a framework for building custom applications as well. Following the architecture overview below,If you’re looking to deploy a ready-made, end-to-end application for collaborative filtering, clustering or classification,here are the steps to follow :

Learn about the REST API endpoints here API Endpoint Reference

Image- Oryx2

Oryx2 consists of three tiers

  1. Lambda Tier – Providing base implementation which is not specific to machine learning. It internally contains side-by-side cooperating layers of the lambda architecture:
  • A Batch Layer, which computes a new “result” (think model, but, could be anything) as a function of all historical data, and the previous result. This may be a long-running operation which takes hours, and runs a few times a day for example.
  • A Speed Layer, which produces and publishes incremental model updates from a stream of new data. These updates are intended to happen on the order of seconds.
  • A Serving Layer, which receives models and updates and implements a synchronous API exposing query operations on the result.
  • A data transport layer, which moves data between layers and receives input from external sources
  1. ML Tier Implementation – The ML tier is simply an implementation and specialization of the generic interfaces mentioned above, which implement common ML needs and then expose a different ML-specific interface for applications to fill in.
  2. End-to-end Application Implementation Tier – In addition to being a framework, Oryx 2 contains complete implementations of the batch, speed and serving layer for three machine learning use cases. These are ready to deploy out-of-the-box, or to be used as the basis for a custom application:
  • Collaborative filtering / recommendation based on Alternating Least Squares
  • Clustering based on k-means
  • Classification and regression based on random decision forests

Download link : https://github.com/OryxProject/oryx/releases

Karthik

Allo! My name is Karthik,experienced IT professional.Upnxtblog covers key technology trends that impacts technology industry.This includes Cloud computing,Blockchain,Machine learning & AI,Best mobile apps, Best tools/open source libs etc.,I hope you would love it and you can be sure that each post is fantastic and will be worth your time.

Share
Published by
Karthik

Recent Posts

Navigating Volatility: Investing in Crypto Derivatives and Risk Management Strategies

The cryptocurrency market is famed for its volatility, presenting each opportunity and demanding situations for…

1 week ago

How Game Developers Use AI in Mobile Games in 2024?

Games since time immemorial have been winning at captivating the users and teleporting them onto…

2 weeks ago

The Impact of AI on Software Development

We are living within an innovation curve wherein cutting-edge technologies are making a hustle and…

2 weeks ago

AI Tools for Research Paper Writing: Learn What They Can Do

Whether it’s the healthcare industry or the automobile sector, artificial intelligence has left its impact…

4 weeks ago

Embracing Innovation: 5 Ways AI is Transforming the Landscape in 2024

Facts only- The big Artificial Intelligence push is unraveling in 2024. No, it wasn’t merely…

4 weeks ago

The Startup Guide to Acquiring Exceptional Developers

In the fiercely competitive world of Hire Developers for Startup, success hinges not just on…

2 months ago

This website uses cookies.