Applying machine learning models: AI at work
Artificial Intelligence (AI) has matured from hype to reality, and we are working on concrete applications in an agile way. Our expert Ertugrul explains how his department provides machine learning models for a wide range of areas.
AI is creating growth opportunities in more and more sectors of the economy. IT professionals like Ertugrul, who works in the field of data analytics at our Frankfurt office, are the driving force behind this, as the number of possible applications in banking is constantly growing – from risk management to business processes. As manager of the Machine Learning Engineering Team, Ertugrul deals with an important subfield of AI that is behind many of the current innovations: machine learning (ML). This includes the development, training and deployment of self-learning models that perform a variety of tasks much more efficiently than humans. While ML makes work easier, its technical implementation is quite demanding, as Ertugrul and his 15 colleagues can testify. Their solutions are used by our parent company in the Netherlands, in Germany and in other countries.
Diverse roles, one goal
Ertugrul describes his mission as follows: “Extracting information and added value from data – that’s more or less what we do.” To achieve this, the work is divided into roles that follow the production process. “There are several important roles in ML. First, there are the customer journey experts (CJE), who structure the business problem and liaise with the technical team. Next, the data scientists step in, developing our models. The task of the ML engineers is then to prepare these models for production. IT engineers are also necessary. They maintain the models in operation.” As an ML engineer, Ertugrul works in the middle of this workflow. His team takes care of applications such as customer interactions (transactions, segmentation), customer dialogue (virtual assistants) and pricing. To illustrate this, Ertugrul tells us about a current project for information extraction.
Application processing 2.0
The challenge: customers submit paper documents when they apply for a loan to purchase property, and the processing of these documents is to be automated. “We are dealing with semi-structured data. The documents contain the same contents, but in different forms,” explains Ertugrul. One example is salary statements, from which information such as addresses is extracted. Using our solution, processors only have to check whether the information is correct, without having to enter any data themselves. As the solution evolves, even this check will be automated. Ertugrul uses open-source tools for this project, for example for text capture via optical character recognition (OCR). The ML model relies on the captured text to identify and read address fields, for example. Ertugrul’s team has to teach the model how to do that, though. “We are the trainers. But for training, we have to provide examples, that is, training data.” The necessary knowledge about these applications is brought in by business subject matter experts – that is, business professionals who know all the processes involved from A to Z. With their input, the training documents can be “labeled” for the model. The model is then optimized in an iterative process with feedback loops.
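To make the extraction task more tangible, here is a minimal sketch in Python. It is not the trained model described above: it is a simple rule-based stand-in that looks for a German-style address (a street line followed by a five-digit postal code and a city) in text that OCR has already captured. The names `Address`, `POSTAL_LINE` and `extract_address`, as well as the sample documents, are assumptions made for illustration only.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed pattern: a five-digit postal code followed by a city name.
# A trained model would learn such cues from labeled documents instead.
POSTAL_LINE = re.compile(r"^(?P<postal_code>\d{5})\s+(?P<city>[^\d,]+)$")


@dataclass
class Address:
    street: str
    postal_code: str
    city: str


def extract_address(ocr_text: str) -> Optional[Address]:
    """Return the first address-like pair of lines, or None if none is found."""
    # Drop empty lines so that layout differences between documents matter less.
    lines = [line.strip() for line in ocr_text.splitlines() if line.strip()]
    # Walk over neighboring lines: the street is assumed to sit directly
    # above the "postal code + city" line.
    for previous, current in zip(lines, lines[1:]):
        match = POSTAL_LINE.match(current)
        if match:
            return Address(
                street=previous,
                postal_code=match["postal_code"],
                city=match["city"].strip(),
            )
    return None


# Two invented salary statements: same contents, different layout.
layout_a = "Salary statement\nMax Mustermann\nMusterweg 12\n60311 Frankfurt am Main\nGross: 4000"
layout_b = "Max Mustermann\n\nGross: 4000\n\nMusterweg 12\n  60311   Frankfurt am Main  "

result_a = extract_address(layout_a)
result_b = extract_address(layout_b)
```

Both layouts yield the same `Address`, which a processor would then only need to confirm. Hand-written rules like this break quickly as layouts multiply, which is one reason to train a model on labeled examples instead.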
Models and languages
This raises the question of what the models are actually made of. “The models are created from data and code,” Ertugrul says. “In general, our models tend to be developed in Python. Some of the things we have to pay attention to include version management of the code and data, as well as relevant software principles.” Results, after all, should always be reproducible. Versioning is especially important when adding new features to models or correcting bugs. “We don't spend six months on development and then present the software. Instead, we develop iteratively: we are constantly creating new versions,” Ertugrul explains. “We introduce new features and versions every two or three weeks.” With each project, new templates emerge that follow established software paradigms (e.g. object-oriented design, test-driven development/TDD). Ertugrul prefers to ship models as Python packages, which can also run in a production environment, rather than as scripts written in Jupyter Notebook.
Agile principles are therefore an integral part of development. The same applies to the deliberately mixed team: the models are developed by data scientists, engineers and business experts, who also include the customer journey specialists. “In the development phase, knowledge of Java virtual machine (JVM) languages such as Java or Scala is also required. This is necessary because the tech stacks of companies such as ING are based on them,” explains Ertugrul. The engineers therefore need to bridge the gap between Python and the JVM world. Production deployment uses CI/CD pipelines (Continuous Integration/Continuous Deployment). Such pipelines ensure that each feature addition is automatically turned into production-ready code. Knowledge of API design, microservices and containers is also necessary. “Microservices represent a kind of island where we deploy our application in isolation,” adds Ertugrul. Microservices thus act, in a sense, as Python “islands” in the Java “sea” of enterprise IT, with APIs as interfaces.
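The “island” idea can be sketched with nothing but the Python standard library. The example below is an illustration under stated assumptions, not ING's actual service: `predict` is a placeholder rule standing in for a trained model, and the names `handle_request`, `PredictHandler` and `make_server`, the port and the JSON fields are all hypothetical. The point is that a caller written in Java or Scala only needs to speak HTTP and JSON and never touches Python code.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Tuple


def predict(features: dict) -> dict:
    """Placeholder for a trained model; the rule below is purely illustrative."""
    amount = float(features.get("amount", 0.0))
    return {"category": "large" if amount >= 1000 else "small"}


def handle_request(raw_body: bytes) -> Tuple[int, dict]:
    """Turn a JSON request body into an HTTP status code and a JSON-ready reply."""
    try:
        features = json.loads(raw_body)
        if not isinstance(features, dict):
            raise ValueError("expected a JSON object")
        return 200, predict(features)
    except (ValueError, TypeError) as error:
        # Malformed JSON or unusable values are the caller's mistake: reply 400.
        return 400, {"error": str(error)}


class PredictHandler(BaseHTTPRequestHandler):
    """The API: the only part of the island that the outside world sees."""

    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, reply = handle_request(self.rfile.read(length))
        payload = json.dumps(reply).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(port: int = 8080) -> HTTPServer:
    """Create the server without starting it, so the caller decides when to run."""
    return HTTPServer(("127.0.0.1", port), PredictHandler)
```

Calling `make_server().serve_forever()` would start the service locally. In a setup like the one described, such a process would typically be packaged in a container and deployed through the CI/CD pipeline.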
The skill mix makes the difference
The field of ML is very complex and offers many directions in which to specialize. “Hardly any single IT professional has all this knowledge; someone like that would be a kind of ‘unicorn’. That’s why, when recruiting, we make sure that people are well versed in ML, but also specialized in one or two other topics,” says Ertugrul. One aspect he emphasizes is maintenance (application monitoring, model monitoring), which also requires communication skills. “Our people have to be able to explain to the IT engineers what they should be paying attention to,” Ertugrul clarifies. Monitoring is also essential for keeping track of how the models behave, as the input variables may change over time, e.g. due to “data drift”. For example, the model might suddenly start identifying the wrong data types. Retraining under production conditions then becomes necessary, including A/B testing.
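One common way to put a number on data drift is the Population Stability Index (PSI), which compares how a variable was distributed at training time with how it is distributed in production. The article does not say which metric the team uses, so the following is only a sketch of that general technique. The function name, the bin count, the smoothing constant and the sample data are assumptions.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    epsilon: float = 1e-6,
) -> float:
    """Compare two samples of one variable; 0.0 means identical bin shares."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    low, high = min(reference), max(reference)
    if high == low:
        raise ValueError("reference sample has no spread to bin")
    width = (high - low) / bins

    def bin_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            index = int((value - low) / width)
            # Values outside the reference range fall into the edge bins.
            index = min(max(index, 0), bins - 1)
            counts[index] += 1
        # epsilon keeps empty bins from causing log(0) or division by zero.
        return [max(count / len(values), epsilon) for count in counts]

    reference_shares = bin_shares(reference)
    current_shares = bin_shares(current)
    return sum(
        (cur - ref) * math.log(cur / ref)
        for ref, cur in zip(reference_shares, current_shares)
    )


# Invented data: production values shifted upwards relative to training.
training_values = [float(v) for v in range(100)]
production_values = [v + 50.0 for v in training_values]

no_drift = population_stability_index(training_values, training_values)
strong_drift = population_stability_index(training_values, production_values)
```

A frequently quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the appropriate threshold depends on the use case. A monitoring job could compute this value regularly and alert the IT engineers when it is exceeded, which might then trigger the retraining mentioned above.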
The challenges of implementation quickly become evident when you consider our international setup. “Our real estate finance application is based on different criteria when used in Romania or the Netherlands – partly because of the language,” Ertugrul explains. “We are also developing many other projects. Just to give you an example, for document classification we rely on FastText, an open-source library from Facebook that supports well over 100 languages. Our tool achieves an accuracy of 88.5%.” A Java wrapper including an API was developed for this Python application. New applications can then be trained directly on the application server. Instructions explain how to label the training documents, so that a data scientist is no longer needed for this step. Another model categorizes account transactions. It is already running in Romania and being rolled out in Luxembourg. ML is also used for “orchestration”: AI-supported segmentation decides which customers are shown which marketing banners. ML also helps with pricing: interest rates are adjusted depending on expected volume and margin changes, with AI determining the sensitivity of individual price points.
Hurdles and solutions
This innovative technology naturally also raises unresolved issues, such as migration to the cloud, which has yet to happen on a large scale in the banking sector. In addition, there is the topic of “explainable AI”. In many ways, AI is still a black box. In the future, comprehensible models should make it easier to evaluate, particularly to clarify ethical aspects. Overall, adds Ertugrul, the field is under “permanent transformation” and only targeted investment in personnel and skills will help: “It's not the technology that matters, but the people.” That's why ING has set up an Analytics Academy to support our employees, from all areas of the bank, in acquiring the necessary knowledge.
An opportunity to grow at ING
Analytics and ML – for Ertugrul, these are particularly fascinating fields of IT. He works with us every day to turn findings from research and open-source communities into business reality. With generous training budgets, we support the professional development of our employees and enable agile working – important success factors for ML projects. For us, however, ML is just one challenging IT topic among many.