Framework for Identification and Prevention of Direct and Indirect Discrimination using Data mining

Mr. A. I. Sheikh, Ms. Monika Kohale, Ms. Aishwarya Wadurkar, Ms. Ashanka Bute, Ms. Dhanashri Baramkar, Ms. Aishwarya Ainchwar

doi:10.17762/ijritcc.v5i3.238

Published: Mar 31, 2017

Abstract

Data mining is the extraction of useful and important information from large collections of data. It also carries negative social perceptions, chiefly potential privacy invasion and potential discrimination. Discrimination means treating people unequally or unfairly on the basis of their membership in a specific group. Automated data collection and data mining techniques such as classification rule mining have made it easier to automate decisions such as loan granting or denial and insurance premium computation. If the training data sets are biased with regard to discriminatory (sensitive) attributes such as age, gender, race, or religion, discriminatory decisions may ensue. For this reason, antidiscrimination techniques, including discrimination discovery, identification, and prevention, have been introduced in data mining. Discrimination may be of two types: direct or indirect. Direct discrimination occurs when decisions are made on the basis of sensitive attributes. Indirect discrimination occurs when decisions are made on the basis of non-sensitive attributes that are strongly correlated with biased sensitive ones. In this paper, we address discrimination prevention in data mining and propose new methods that can prevent direct discrimination, indirect discrimination, or both at the same time. We discuss how to clean training data sets and transformed data sets in such a way that direct and/or indirect discriminatory decision rules are converted to legitimate (non-discriminatory) classification rules. We also propose new measures and metrics to analyse the utility of the proposed approaches, and we compare these approaches.
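
To make the notion of a directly discriminatory classification rule concrete, the sketch below scores a rule with extended lift (elift), a measure used in the discrimination-discovery literature. The abstract does not say which measure the paper uses, so the choice of elift is an assumption, and the loan records, attribute names, and function names are all hypothetical. The elift of a rule "sensitive, context -> outcome" is its confidence divided by the confidence of "context -> outcome"; values well above 1 suggest that the sensitive attribute raises the chance of the negative outcome.

```python
from typing import Dict, List

Record = Dict[str, str]


def matches(record: Record, items: Dict[str, str]) -> bool:
    """True if the record has every attribute value listed in `items`."""
    return all(record.get(key) == value for key, value in items.items())


def confidence(records: List[Record], premise: Dict[str, str],
               outcome: Dict[str, str]) -> float:
    """Fraction of records satisfying `premise` that also satisfy `outcome`."""
    covered = [r for r in records if matches(r, premise)]
    if not covered:
        return 0.0
    return sum(1 for r in covered if matches(r, outcome)) / len(covered)


def extended_lift(records: List[Record], sensitive: Dict[str, str],
                  context: Dict[str, str], outcome: Dict[str, str]) -> float:
    """elift = conf(sensitive, context -> outcome) / conf(context -> outcome)."""
    base = confidence(records, context, outcome)
    if base == 0.0:
        # The outcome never occurs in this context, so the ratio is undefined;
        # returning 0.0 here is a convention chosen for this sketch.
        return 0.0
    return confidence(records, {**sensitive, **context}, outcome) / base


# Hypothetical toy data: eight loan applications from the same city.
records: List[Record] = (
    [{"gender": "female", "city": "X", "loan": "deny"}] * 3
    + [{"gender": "female", "city": "X", "loan": "grant"}] * 1
    + [{"gender": "male", "city": "X", "loan": "deny"}] * 1
    + [{"gender": "male", "city": "X", "loan": "grant"}] * 3
)

elift = extended_lift(records, {"gender": "female"}, {"city": "X"},
                      {"loan": "deny"})
print(f"elift = {elift:.2f}")  # prints "elift = 1.50"
```

In this toy data, applicants from city X are denied half of the time overall, but female applicants from city X are denied three quarters of the time, which gives an elift of 1.5. A prevention method of the kind the abstract describes would transform the training data until rules like this one fall below a chosen threshold.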

How to Cite

Sheikh, A. I., Kohale, M., Wadurkar, A., Bute, A., Baramkar, D., & Ainchwar, A. (2017). Framework for Identification and Prevention of Direct and Indirect Discrimination using Data mining. International Journal on Recent and Innovation Trends in Computing and Communication, 5(3), 45–48. https://doi.org/10.17762/ijritcc.v5i3.238