How to Get Fast Predictions From Real-Time Data
ING is a Data Driven Experimental Enterprise that invests heavily in big data, analytics, and stream processing. Streaming analytics (or fast data) is the field of making predictions on real-time data. It is becoming an increasingly popular subject in enterprise organizations because customers want real-time experiences, such as notifications and advice based on their online behavior and other users’ actions.
A typical streaming analytics solution follows a 'pipes and filters' pattern that consists of three main steps: detecting patterns in raw event data (Complex Event Processing), evaluating the outcomes with the aid of business rules and machine learning algorithms, and deciding on the next action. ING recently chose Apache Flink as its primary streaming data processing technology.
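To make the three steps concrete, here is a minimal sketch of a pipes-and-filters pipeline in plain Python generators. It is not ING's implementation and does not use the Apache Flink API. The event shape, the "repeated failed login" pattern, the scoring rule, and all names (`Event`, `detect_patterns`, `evaluate`, `decide`, `run_pipeline`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List, Tuple


@dataclass(frozen=True)
class Event:
    user: str
    kind: str
    timestamp: float  # seconds; hypothetical event-time field


@dataclass(frozen=True)
class Pattern:
    user: str
    name: str
    count: int


def detect_patterns(events: Iterable[Event], window: float = 60.0,
                    threshold: int = 3) -> Iterator[Pattern]:
    """Step 1: emit a pattern when a user reaches `threshold` failed
    logins within `window` seconds (a toy stand-in for CEP)."""
    recent: Dict[str, List[float]] = {}
    for e in events:
        if e.kind != "login_failed":
            continue
        # Keep only this user's failures that still fall inside the window.
        times = [t for t in recent.get(e.user, []) if e.timestamp - t <= window]
        times.append(e.timestamp)
        recent[e.user] = times
        if len(times) >= threshold:
            yield Pattern(e.user, "repeated_login_failure", len(times))


def evaluate(patterns: Iterable[Pattern]) -> Iterator[Tuple[Pattern, float]]:
    """Step 2: score each pattern. A fixed rule stands in for business
    rules or a trained model (assumption)."""
    for p in patterns:
        yield p, min(1.0, p.count / 5)


def decide(scored: Iterable[Tuple[Pattern, float]],
           block_at: float = 0.8) -> Iterator[Tuple[str, str]]:
    """Step 3: map a score to a next action."""
    for p, score in scored:
        yield p.user, ("block_account" if score >= block_at else "notify_user")


def run_pipeline(events: Iterable[Event]) -> List[Tuple[str, str]]:
    # Each filter consumes the output of the previous one, lazily.
    return list(decide(evaluate(detect_patterns(events))))
```

Because every stage is a generator, events flow through one at a time rather than in batches, which is the property a streaming engine provides at scale with added partitioning, state management, and fault tolerance.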
In this talk, Bas presents an architecture and technology stack in use at ING for streaming analytics solutions. It covers many use cases, including actionable insights for personal marketing and fraud detection. Bas also discusses a few architecture challenges that arise when dealing with streaming data, such as latency issues, event time vs. server time, and exactly-once processing.
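The difference between event time and server time can be illustrated with a small sketch. It assumes each record carries both the time the event occurred and the time it arrived at the server, and it counts records per fixed-size (tumbling) window under each notion of time. The record layout and the function names are hypothetical, and a real engine would also need a mechanism for late data (such as watermarks), which is left out here.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def window_start(ts: float, size: float) -> float:
    """Start of the tumbling window of length `size` that contains `ts`."""
    return (ts // size) * size


def count_per_window(records: Iterable[Tuple[float, float]], size: float,
                     use_event_time: bool) -> Dict[float, int]:
    """Count records per window.

    Each record is (event_time, arrival_time) in seconds. With
    use_event_time=True the window is chosen by when the event happened;
    otherwise by when the server received it.
    """
    counts: Dict[float, int] = defaultdict(int)
    for event_time, arrival_time in records:
        ts = event_time if use_event_time else arrival_time
        counts[window_start(ts, size)] += 1
    return dict(counts)


# The third record happened at t=9.5 but arrived late, at t=12.0.
records = [(1.0, 2.0), (8.0, 9.0), (9.5, 12.0), (11.0, 11.5)]
by_event_time = count_per_window(records, 10.0, use_event_time=True)
by_server_time = count_per_window(records, 10.0, use_event_time=False)
```

With these sample records, the late arrival is counted in the first window under event time and in the second window under server time, so the two notions of time produce different aggregates from the same data.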
Bas is a programmer, scientist, and IT manager. He has been at ING since 2013 in several technical leadership roles. He has worked on the data lake and streaming data platform, and is now Technology Lead in the global innovation center ING Labs. His academic background is in Artificial Intelligence and Informatics, and his professional experience spans software development, design, and architecture, with a broad technical view from C++ to Prolog to Scala. He occasionally teaches programming courses and is a regular speaker at conferences and informal meetings, where he enthusiastically presents a mixture of market context, his own vision, business cases, architecture, and source code.