Do big data algorithms treat people differently based on characteristics like race, religion, and gender? Cathy O'Neil in her new book Weapons of Math Destruction and Frank Pasquale in The Black Box Society both look closely and critically at concerns over discrimination, the challenges of knowing whether algorithms are treating people unfairly, and the role of public policy in addressing these questions.
Tech leaders must take seriously the debate over data usage -- both because discrimination in any form has to be addressed, and because a failure to do so could lead to misguided measures such as mandated disclosure of algorithmic source code.
What’s not in question is that the benefits of the latest computational tools are all around us. Machine learning helps doctors diagnose cancer; speech recognition software simplifies our everyday interactions and helps those with disabilities; educational software improves learning and prepares children for the challenges of a global economy; and new analytics and data sources are extending credit to previously excluded groups. Autonomous cars promise to reduce accidents by 90 percent.
Jason Furman, the Chairman of the Council of Economic Advisers, got it right when he said in a recent speech that his biggest worry about artificial intelligence is that we do not have enough of it.
Of course, any technology, new or old, can further illegal or harmful activities, and the latest computational tools are no exception. But, in the same regard, there is no exception for big data analysis in the existing laws that protect consumers and citizens from harm and discrimination.
The Fair Credit Reporting Act protects the public against the use of inaccurate or incomplete information in decisions regarding credit, employment, and insurance. While passed in the 1970s, this law has been effectively applied to business ventures that use advanced techniques of data analysis, including the scraping of personal data from social media to create profiles of people applying for jobs.
Further, no enterprise can legally use computational techniques to evade statutory prohibitions against discrimination on the basis of race, color, religion, gender and national origin in employment, credit, and housing. In a 2014 report on big data, the Obama Administration emphasized this point and told regulatory agencies to “identify practices and outcomes facilitated by big data analytics that have a discriminatory impact on protected classes, and develop a plan for investigating and resolving violations....”
Even with these legal protections, there is a move to force greater public scrutiny -- including a call for public disclosure of all source code used in decision-making algorithms. Full algorithmic transparency would be harmful. It would reveal selection criteria in such areas as tax audits and terrorist screening that must be kept opaque to prevent people from gaming the system. And by allowing business competitors to use a company’s proprietary algorithms, it would reduce incentives to create better algorithms.
Moreover, full disclosure won’t actually contribute to the responsible use of algorithms. Source code is understandable only by experts, and even for them it is hard to say definitively what a program will do based solely on the source code. This is especially true for many of today’s programs that update themselves frequently as they use new data.
To respond to public concern about algorithmic fairness, businesses, government, academics and public interest groups need to come together to establish a clear operational framework for responsible use of big data analytics. Current rules already require some validation of the predictive accuracy of statistical models used in credit, housing, and employment. But technology industry leaders can and should do more.
FTC Commissioner McSweeny has the right idea, with her call for a framework of “responsibility by design.” This would incorporate fairness into algorithms by testing them -- at the development stage -- for potential bias. This should be supplemented by audits after the fact to ensure that algorithms are not only properly designed, but properly operated.
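One way such a development-stage test might look, offered only as a rough sketch: compare the rate of favorable outcomes an algorithm gives to different groups and flag large gaps for human review. The metric chosen here (the ratio of the lowest group rate to the highest), the 0.8 cutoff (borrowed from the "four-fifths" rule of thumb in U.S. employment-selection guidance), the function names, and the sample data are all assumptions made for illustration. None of them comes from Commissioner McSweeny's proposal, and a real audit would weigh many more factors.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the share of favorable outcomes for each group.

    records: iterable of (group_label, approved) pairs, where approved is a bool.
    """
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in records:
        totals[group] += 1
        if ok:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def disparate_impact_ratio(records):
    """Ratio of the lowest group selection rate to the highest (1.0 means parity)."""
    rates = selection_rates(records)
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # no group received a favorable outcome, so there is no gap to measure
    return min(rates.values()) / highest


# Made-up decisions from a hypothetical credit model: (group, approved)
sample = (
    [("A", True)] * 60 + [("A", False)] * 40
    + [("B", True)] * 30 + [("B", False)] * 70
)

ratio = disparate_impact_ratio(sample)
THRESHOLD = 0.8  # assumed cutoff for illustration, not a legal test
flagged = ratio < THRESHOLD
print(f"ratio={ratio:.2f}, flagged for review={flagged}")
```

A flag from a check like this would not prove discrimination; it would only mark a result that developers or auditors should examine more closely, which is also why the same check could be rerun in after-the-fact audits on live decisions.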
Important in this cooperative stakeholder effort would be a decision about which areas of economic life to include -- starting with sensitive areas like credit, housing and employment. The framework would also need to address the proper roles of the public, developers and users of algorithms, regulators, independent researchers, and subject matter experts, including ethics experts.
It is important to begin to develop this framework now, and to ensure the uses of the new technology are, and are perceived to be, fair to all. The public must be confident in the fairness of algorithms, or a backlash will threaten their very real and substantial benefits.
This article is published as part of the IDG Contributor Network.