Big Data Analysis & Machine Learning using Wendelin
Wendelin is an open-source, 100% Python-based platform for
data ingestion, storage and analysis. Originating from an ongoing EU
research project to develop a Big Data Analysis and Machine Learning stack
"Made in France", Wendelin integrates popular libraries such
as Jupyter, SciPy, pandas, matplotlib and scikit-learn to offer a range of tools
for analysing data without having to recompile and without being limited by
available memory.
Wendelin Analytics for the financial industry
The Wendelin stack is 100% open source, which makes it possible to set up
autonomous cloud-based or local infrastructures as well as applications. The
underlying architecture uses NEO distributed storage for persistence and
parallel computing, while the out-of-core feature developed for Wendelin enables
computations that are not limited by the memory available in a cluster.
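The general idea behind out-of-core computation can be illustrated with a minimal sketch that uses only the Python standard library. It is not Wendelin's implementation or API: the function names, the flat file of 64-bit floats and the chunk size are all assumptions chosen for illustration. The point is that only one chunk is held in memory at a time, so the data set can be larger than the available RAM.

```python
import array
import os
import tempfile


def chunked_mean(path, chunk_items=1024):
    """Mean of a file of float64 values, reading one fixed-size chunk at a time.

    Hypothetical helper for illustration; assumes the file length is a
    multiple of 8 bytes (one float64 per item).
    """
    total = 0.0
    count = 0
    item_size = array.array("d").itemsize
    with open(path, "rb") as f:
        while True:
            raw = f.read(chunk_items * item_size)
            if not raw:
                break
            chunk = array.array("d")
            chunk.frombytes(raw)  # only this chunk lives in memory
            total += sum(chunk)
            count += len(chunk)
    return total / count if count else 0.0


def write_demo_file(n):
    """Write the values 0.0 .. n-1 as float64 and return the file path.

    Demo only: the values are built in memory here because n is small.
    """
    fd, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(fd, "wb") as f:
        array.array("d", (float(i) for i in range(n))).tofile(f)
    return path
```

Calling `chunked_mean(write_demo_file(10000), chunk_items=256)` processes the file in blocks of 256 values instead of loading it whole.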
Possible usage scenarios for Wendelin include time-series analysis to search for
fraudulent patterns in transactions, machine-learning-based prediction of
human behavior (cross-selling) or machine behavior (ATM failures), or simply
intrusion detection through analysis of activity logs.
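As a rough illustration of the time-series scenario, the sketch below flags transaction amounts that deviate strongly from a trailing window of recent values. It is a deliberately simple rolling z-score check written with the standard library, not a method taken from Wendelin; the function name, window size and threshold are assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def flag_outliers(amounts, window=5, threshold=3.0):
    """Return the indices of values that lie more than `threshold` standard
    deviations away from the mean of the previous `window` values."""
    history = deque(maxlen=window)
    flagged = []
    for i, amount in enumerate(amounts):
        if len(history) == window:
            mu = mean(history)
            sigma = pstdev(history)
            # Skip the check when the window is constant (sigma == 0).
            if sigma > 0 and abs(amount - mu) > threshold * sigma:
                flagged.append(i)
        history.append(amount)
    return flagged


# Hypothetical transaction amounts with one obvious spike.
amounts = [20, 22, 19, 21, 20, 23, 500, 21, 20]
suspicious = flag_outliers(amounts)
```

Note that a flagged value still enters the window afterwards and inflates the following statistics; a production detector would likely exclude it or use a more robust estimator.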
Wendelin also offers the ability to collect data in real time from multiple
sources (ATMs, websites, e-commerce, customers) through a single data collector
(Treasure Data's Fluentd). This aggregated
data can then be structured using machine learning tools for further analysis,
as well as for intelligent or correlation searches.
Wendelin Stack
Wendelin Core
NEO
ERP5
SlapOS
Wendelin Core: Analytics
Wendelin Core allows ingested data to be processed with various
machine learning and analysis tools for metadata extraction, normalization
and structuring. scikit-learn is used as the core library because
it shares the same low-level representation of data with NEO.
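To make "normalization and structuring" concrete, here is a small standard-library sketch that turns raw text lines into structured records and adds a min-max scaled value. The `sensor_id,value` line format and the function name are hypothetical, and the scaling is a hand-written stand-in for what a library such as scikit-learn would normally provide.

```python
def structure_and_normalize(raw_lines):
    """Parse 'sensor_id,value' lines into dicts and add a min-max scaled value."""
    records = []
    for line in raw_lines:
        sensor_id, value = line.strip().split(",")
        records.append({"sensor_id": sensor_id, "value": float(value)})
    if not records:
        return records
    lo = min(r["value"] for r in records)
    hi = max(r["value"] for r in records)
    span = hi - lo
    for r in records:
        # Map values to [0, 1]; fall back to 0.0 when all values are equal.
        r["scaled"] = (r["value"] - lo) / span if span else 0.0
    return records
```

For example, `structure_and_normalize(["a,10", "b,20", "c,30"])` yields three records whose `scaled` fields span the range from 0 to 1.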
NEO: Data Archive
Wendelin stores raw data in a local or private-cloud database powered by SlapOS
and NEO, whose scalable NoSQL storage engine can leverage a redundant array
of inexpensive servers to achieve virtually unlimited data processing capacity.
ERP5: PaaS
ERP5 is the architecture used to run processes related to the storage,
access and computation of data. Being full-featured, ERP5 can also
be used to create and host analytics applications (internal
or client-facing), optionally including business processes.
SlapOS: Deployment
The Wendelin stack is deployed and automatically managed via SlapOS,
making it easy to scale the underlying cluster while also providing
resiliency.
Success Cases & Services
Wendelin components comply with bank security requirements and have been
deployed successfully in the financial sector. References include ERP5, SlapOS
and NEO managing the West African monetary system and the
first French crowd equity platform. Wendelin itself is currently being
deployed in pilots related to the ongoing research project. Services
provided by Nexedi cover data collection, analysis and visualization, as well as
consulting on single topics or the implementation of full-fledged Big Data
applications on the Wendelin stack.