Building Payment Classification Models from Rules and Crowdsourced Labels: A Case Study
Year: 2018
Published in: Advanced Information Systems Engineering Workshops
The ability to classify customer-to-business payments enables retail financial institutions to better understand their customers' expenditure patterns and to customize their offerings accordingly. However, payment classification is a difficult problem because of the large and evolving set of businesses and the fact that each business may offer multiple types of products, e.g., a business may sell both food and electronics. Two major approaches to payment classification are rule-based classification and machine learning-based classification on transactions labeled by the customers themselves (a form of crowdsourcing). The rule-based approach is not scalable, as it requires rules to be maintained for every business and type of transaction. The crowdsourcing approach leads to inconsistencies and is difficult to bootstrap, since it requires a large number of customers to manually label their transactions for an extended period of time. This paper presents a case study at a financial institution in which a hybrid approach is employed. A set of rules is used to bootstrap a financial planner that allows customers to view their transactions classified with respect to 66 categories, and to add labels to unclassified transactions or to re-label transactions. The crowdsourced labels, together with the initial rule set, are then used to train a machine learning model. We evaluated our model on a real anonymised dataset, provided by the financial institution, which consists of wire transfers and card payments. In particular, for the wire transfer dataset, the hybrid approach increased the coverage of the rule-based system from 76.4% to 87.4% while replicating the crowdsourced labels with a mean AUC of 0.92, despite inconsistencies between crowdsourced labels.
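The hybrid idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the rule keywords, the example transactions, the category names, and the token-voting classifier (a simple stand-in for the machine learning model) are all assumptions made for illustration. The sketch labels transactions with rules, lets crowdsourced labels override or extend them, trains on the combined labels, and compares coverage before and after.

```python
from collections import Counter, defaultdict

# Hypothetical rule set: keyword in the payment description -> category.
RULES = {"grocer": "food", "pharmacy": "health"}


def classify_by_rules(description):
    """Return the category of the first matching rule, or None if no rule fires."""
    text = description.lower()
    for keyword, category in RULES.items():
        if keyword in text:
            return category
    return None


def train_token_model(labeled):
    """Count how often each token co-occurs with each label (toy stand-in model)."""
    counts = defaultdict(Counter)
    for description, label in labeled:
        for token in description.lower().split():
            counts[token][label] += 1
    return counts


def classify_by_model(model, description):
    """Let every known token vote for its labels; None if no token is known."""
    votes = Counter()
    for token in description.lower().split():
        votes.update(model.get(token, {}))
    return votes.most_common(1)[0][0] if votes else None


def classify_hybrid(model, description):
    """Use the rules first and fall back to the trained model."""
    return classify_by_rules(description) or classify_by_model(model, description)


def coverage(classifier, descriptions):
    """Fraction of transactions that receive any category."""
    hits = sum(1 for d in descriptions if classifier(d) is not None)
    return hits / len(descriptions)


# Invented example data, not taken from the paper's dataset.
transactions = [
    "city grocer 12",
    "corner pharmacy",
    "bookshop central",
    "bookshop north",
    "unknown vendor",
]
crowd_labels = {"bookshop central": "leisure"}  # hypothetical customer label

# Training set: crowdsourced label if present, otherwise the rule label.
training = []
for t in transactions:
    label = crowd_labels.get(t) or classify_by_rules(t)
    if label is not None:
        training.append((t, label))

model = train_token_model(training)

rule_coverage = coverage(classify_by_rules, transactions)
hybrid_coverage = coverage(lambda d: classify_hybrid(model, d), transactions)
print(rule_coverage, hybrid_coverage)
```

In this toy data the single crowdsourced label for one bookshop lets the model also classify the other bookshop, which no rule covers; that generalization from labeled to unlabeled transactions is the effect the coverage figures in the abstract describe.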
Related by author
11 publications found
Differentiable Characteristics Of Telegram Mediums During Protests In Belarus 2020
Publisher: Social Network Analysis and Mining
Authors: Tymofii Brik, Ivan Slobozhan, Rajesh Sharma
Applying The CRISP‑DM Data Mining Process In The Financial Services Industry: Elicitation Of Adaptation Requirements
Publisher: Data & Knowledge Engineering
Authors: Veronika Plotnikova, Fredrik Milani, Marlon Dumas
Adaptations Of Data Mining Methodologies: A Systematic Literature Review
Publisher: University of Tartu
Authors: Veronika Plotnikova, Fredrik Milani, Marlon Dumas
Towards a Data Mining Methodology for the Banking Domain
Publisher: University of Tartu
Authors: Veronika Plotnikova
Longitudinal Change In Language Behaviour During Protests: A Case Study Of Euromaidan In Ukraine
Publisher: Social Network Analysis and Mining
Authors: Tymofii Brik, Ivan Slobozhan, Rajesh Sharma
Do Facial Trait Correlates With Roll Call Voting In Parliament? Using Fwhr To Study Performance In Politics
Publisher: arXiv
Authors: Tymofii Brik, Rahul Goel, Rajesh Sharma
Designing A Data Mining Process For The Financial Services Domain
Publisher: Journal of Business Analytics
Authors: Veronika Plotnikova, Marlon Dumas, Alexander Nolte, Fredrik Milani
Data Mining Methodologies in the Banking Domain: A Systematic Literature Review
Publisher: Perspectives in Business Informatics Research
Authors: Veronika Plotnikova, Fredrik Milani, Marlon Dumas
Systematic Literature Review Protocol: Adaptations of Data Mining Methodologies
Publisher: University of Tartu
Authors: Veronika Plotnikova, Fredrik Milani, Marlon Dumas