Microsoft Research AI Ethics Checklist Crafts Principles for Designing AI DevOps Processes

The News: Microsoft Research has recently published a set of principles for designing AI ethics checklists that can be readily operationalized in AI DevOps processes. The research was conducted in conjunction with Microsoft’s Aether Working Group on Bias and Fairness. The authors drew from existing AI ethics checklists, exploratory interviews with ML practitioners, and dozens of 90-minute co-design workshops. See the VentureBeat article here and research paper here.

Analyst Take: Ethics is an exceptionally slippery concept to operationalize. An AI-driven application may commit ethical and cultural faux pas that offend some segments of the population. But that same application’s behavior may be regarded as positive or inoffensive by others.

AI Ethics has been Mired in Well-meaning Enterprise Bureaucracy

As we examine the anxieties behind the “ethical AI” movement, we must ask whether they’ve been hyped out of proportion by mainstream culture. We must also ask whether the approaches being proposed for instilling ethics in business AI development and operations are truly taking hold, and whether they are likely to ensure that ethically dubious AI applications never see the light of day.

Many organizations have published high-level principles intended to guide the ethical development and deployment of AI systems. One of the things that concerns me about today’s AI ethics mania is the top-down nature of how it’s being addressed by corporate management. Indeed, it’s not hard to start rolling your eyes when you consider how AI ethics is being addressed in the business world right now. Many organizations have resorted to any or all of the following tactics:

  • Hire a company AI ethics officer
  • Establish an AI ethics oversight board
  • Publish an AI code of ethics
  • Conduct regular AI ethics audits
  • Survey stakeholders on AI ethics matters
  • Require employees to receive AI ethics training
  • Institute AI ethics whistleblowing processes
  • Organize AI ethics centers of excellence

AI DevOps Controls are Where Ethics-relevant Metrics Should be Enforced

These tactics strike me as a well-meaning attempt to add new layers of red tape, meetings, and documentation that will have little input into AI development and operations processes. To the extent that these procedures become binding mandates on the AI development process, the bureaucratic overkill could foster cynicism around the need for AI ethics safeguards of any sort.

Working data scientists should build ethics-assurance checklists into the controls that govern their DevOps workflows. This would be preferable to how AI ethics has generally been handled in the business world: as a top-down exercise involving well-meaning, albeit often ineffectual, committees, working groups, and other teams that produce vague guidance of little use to AI DevOps professionals.

If enterprises truly want to ensure that ethics-friendly AI apps become standard within the organization, the appropriate governance controls must be baked into the tools and platforms that drive DevOps workflows. Ideally, these ethics safeguards should be automated, enforced, and monitored at every stage in the AI DevOps lifecycle.
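As a rough illustration of what an automated, enforced safeguard could look like, the sketch below treats ethics checks as ordinary release gates that a pipeline runs before deployment. Everything in it is hypothetical: the `CheckResult` type, the two example checks, the metadata keys (`uses_pii`, `pii_review_id`, `fairness_gap`, `fairness_gap_limit`), and the default limit are assumptions made for illustration, not part of any tool or of the Microsoft Research paper.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class CheckResult:
    name: str
    passed: bool
    detail: str


# A check inspects release metadata and reports pass or fail.
Check = Callable[[Dict[str, Any]], CheckResult]


def pii_review_check(release: Dict[str, Any]) -> CheckResult:
    # Hypothetical rule: a release that uses PII needs a recorded review ID.
    ok = (not release.get("uses_pii", False)) or bool(release.get("pii_review_id"))
    detail = "PII review satisfied" if ok else "PII used without a recorded review"
    return CheckResult("pii_review", ok, detail)


def fairness_gap_check(release: Dict[str, Any]) -> CheckResult:
    # Hypothetical rule: the measured fairness gap must not exceed the agreed limit.
    # A missing measurement defaults to 1.0, so it fails rather than passes silently.
    gap = float(release.get("fairness_gap", 1.0))
    limit = float(release.get("fairness_gap_limit", 0.1))
    return CheckResult("fairness_gap", gap <= limit, f"gap={gap:.2f}, limit={limit:.2f}")


def run_gate(release: Dict[str, Any], checks: List[Check]) -> Tuple[bool, List[CheckResult]]:
    # Run every check so the report is complete, then allow deployment only if all pass.
    results = [check(release) for check in checks]
    return all(r.passed for r in results), results


if __name__ == "__main__":
    candidate = {"uses_pii": True, "pii_review_id": "", "fairness_gap": 0.05}
    allowed, report = run_gate(candidate, [pii_review_check, fairness_gap_check])
    for r in report:
        print(f"{r.name}: {'PASS' if r.passed else 'FAIL'} ({r.detail})")
    print("deploy allowed:", allowed)
```

The design choice worth noting is that the gate fails closed: a release with no fairness measurement is blocked, which keeps the control from being bypassed by omission.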

On a high level, AI ethics checklists should govern how enterprises apply these safeguards globally. As organizations incorporate ethics safeguards into the AI DevOps pipeline, they should heed the following checklist:

  • Incorporate a full range of regulatory-compliant controls on access, use, and modeling of personally identifiable information in AI applications.
  • Ensure that developers consider the downstream risks of relying on specific AI algorithms or models—such as facial recognition—whose intended benign use (such as authenticating user logins) could also be vulnerable to abuse in “dual-use” scenarios (such as targeting specific demographics to their disadvantage).
  • Instrument your AI DevOps processes with an immutable audit log to ensure visibility into every data element, model variable, development task, and operational process that was used to build, train, deploy, and administer ethically aligned apps.
  • Institute procedures to ensure that every AI DevOps task, intermediate work product, and deliverable app can be explained in plain language in terms of its bearing on the applicable ethical constraints or objectives.
  • Implement quality-control checkpoints in the AI DevOps process in which further reviews and vetting are done to verify that there remain no hidden vulnerabilities—such as biased second-order feature correlations—that might undermine the ethical objectives being sought.
  • Integrate ethics-relevant feedback from subject matter experts and stakeholders into the collaboration, testing, and evaluation processes surrounding iterative development of AI applications.
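The audit-log item in the checklist above can be sketched with a hash chain, in which each entry commits to the one before it. This is a minimal illustration under stated assumptions: the entry layout, the `GENESIS` marker, and the function names are hypothetical, and the result is tamper-evident rather than truly immutable (a production log would also need write-once storage and access controls).

```python
import hashlib
import json
from typing import Any, Dict, List

# Placeholder "previous hash" for the first entry (an assumption of this sketch).
GENESIS = "0" * 64


def _digest(prev_hash: str, event: Dict[str, Any]) -> str:
    # Hash the previous entry's hash together with this event's canonical JSON.
    payload = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: List[Dict[str, Any]], event: Dict[str, Any]) -> None:
    # Link the new entry to the hash of the last one, or to GENESIS if the log is empty.
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify_log(log: List[Dict[str, Any]]) -> bool:
    # Recompute every hash in order; any edited or reordered entry breaks the chain.
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log: List[Dict[str, Any]] = []
    append_entry(log, {"stage": "train", "dataset": "customers-v1"})
    append_entry(log, {"stage": "deploy", "model": "churn-model-3"})
    print("intact:", verify_log(log))
    log[0]["event"]["dataset"] = "customers-v2"  # simulate after-the-fact tampering
    print("after tampering:", verify_log(log))
```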

Incorporate Ethics Metrics into ML Model Assurance Practices

Ethics is simply another set of metrics to be addressed in an enterprise AI DevOps process.

Ideally, organizations should also evaluate whether their ML model assurance tools make such metrics operationalizable in their data science practices.

Model assurance is the ability to determine whether a model remains fit for its assigned task, whether it be recognizing a face, understanding human speech, predicting customer behavior, or something else. Unfortunately, few of today’s ML model assurance tools incorporate explicit controls on the ethical behavior of ML-based apps. Some model assurance environments enable developers to detect and prevent biases and privacy violations associated with ML models, and those are indeed ethical concerns.

However, few tools put those concerns into a broader ethics-assurance context. Instead, solutions such as Google AI Platform, Driverless AI, Microsoft Azure Machine Learning MLOps, Amazon SageMaker Model Monitor, and AI Assurance are primarily designed to address model accuracy, decay, vulnerability, security, and explainability.

It’s up to every ML developer, whether or not they use such tools, to use management best practices that ensure their handiwork doesn’t encroach on ethics concerns in the broadest context. Developing a very good, concrete, operationalizable checklist that spans the AI DevOps pipeline would be ideal.
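To make the idea of an ethics-relevant metric concrete, here is a minimal sketch of one commonly used bias measure, the demographic parity difference: the gap between the highest and lowest rates at which a model assigns the positive outcome across groups. The function names and the toy data are assumptions for illustration. The choice of this particular metric is also an assumption, since the article does not prescribe one, and a single number like this captures only one narrow notion of fairness.

```python
from collections import defaultdict
from typing import Dict, Hashable, Sequence


def selection_rates(predictions: Sequence[int], groups: Sequence[Hashable]) -> Dict[Hashable, float]:
    # Share of positive (1) predictions within each group.
    if len(predictions) != len(groups):
        raise ValueError("predictions and groups must have the same length")
    totals: Dict[Hashable, int] = defaultdict(int)
    positives: Dict[Hashable, int] = defaultdict(int)
    for pred, group in zip(predictions, groups):
        totals[group] += 1
        if pred == 1:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(predictions: Sequence[int], groups: Sequence[Hashable]) -> float:
    # Gap between the highest and lowest group selection rates (0.0 means parity).
    rates = selection_rates(predictions, groups)
    if not rates:
        raise ValueError("at least one prediction is required")
    return max(rates.values()) - min(rates.values())


if __name__ == "__main__":
    # Toy data: group "a" is selected 2 times out of 4, group "b" 1 time out of 4.
    preds = [1, 1, 0, 0, 1, 0, 0, 0]
    grps = ["a", "a", "a", "a", "b", "b", "b", "b"]
    print(selection_rates(preds, grps))
    print(demographic_parity_difference(preds, grps))
```

A value like this could be tracked alongside accuracy and drift in a model assurance workflow, with the acceptable threshold set by the people accountable for the application rather than hard-coded by the tool.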

Ethics Checklists are Great, but Come With Their own set of Challenges

As noted in Microsoft Research’s cited study, AI ethics checklists are most useful within well-defined scopes. For example, many groups have narrowed the scope of their AI-ethics checklists to bias, privacy, accountability, transparency, interpretability, and other concerns on which there are crisp best practices that can be addressed in line with consensus frameworks.

However, some AI use cases don’t have obvious ethical concerns that can be translated into clear procedural rules for DevOps professionals to follow. According to the authors, one researcher found that AI practitioners felt that ethical metrics such as fairness were difficult to operationalize for many real-world AI systems, including chatbots, search engines, and automated writing evaluation systems.

Indeed, translating many AI application requirements into ethics-relevant metrics that can be operationalized crisply in checklists may be very difficult. The authors point out that AI ethics is a complex sociocultural concern that seldom can be reduced to a “set of simple, agreed-upon, and known-in-advance technical actions that can be distilled into ‘yes or no’ questions to be implemented by individual practitioners.”

Also, ethics checklists can be counterproductive when they overspecify and thereby attempt to micromanage the relevant AI DevOps processes. The authors found that “when checklists have been introduced in other domains, such as structural engineering, aviation, and medicine,” their awkwardness and the fact that they’ve been developed “without involving practitioners in their design or implementation” often lead to their being misused, misunderstood, or ignored entirely.

Likewise, AI ethics checklists can do a disservice if they reduce DevOps procedures to cut-and-dried rules and ignore the many gray areas that require ad-hoc and contextual human judgment. As the authors state, “AI ethics principles can place practitioners in a challenging moral bind by establishing ethical responsibilities to different stakeholders without offering any guidance on how to navigate tradeoffs when these stakeholders’ needs or expectations conflict.”

The Takeaway — Microsoft Research’s Effort is Helpful, yet for AI Practitioners, There’s More to be Done. Model Assurance Needs Ethics Metrics to Guide the Entire Lifecycle

We should commend Microsoft Research’s recent effort to catalyze consensus within the practitioner community for the purpose of developing clear principles for designing operationalizable AI ethics checklists.

Though the researchers don’t publish a one-size-fits-all operationalizable AI ethics checklist, they provide a useful discussion of the scenarios within which checklists can be useful, and also within which they can be counterproductive or irrelevant to their intended users.

However, as the cited initiative’s research paper states, “the abstract nature of AI ethics principles makes them difficult for practitioners to operationalize.” As befits the scope of this topic, AI application and tool vendors are still trying to bring their ethics-assurance frameworks into coherent shape. Everybody, even the supposed experts, is groping for a consensus approach and practical tools to make ethics assurance a core component of AI DevOps governance.

Model assurance needs ethics metrics to guide the entire AI lifecycle. Though there is a growing range of AI DevOps and model-assurance tools on the market, few have been explicitly designed with comprehensive ethics guardrails. However, that shouldn’t stop AI practitioners from pressing vendors to add these capabilities to their offerings.

Furthermore, AI practitioners may have acquired key AI applications from software-as-a-service providers and outsourcers of all types. If so, they should ask those providers for full disclosure of their own practices (such as ethics officers, oversight boards, codes of conduct, audit logs, and automated controls) for ensuring alignment with the acquiring organization’s AI ethics principles.

Futurum Research provides industry research and analysis. These columns are for educational purposes only and should not be considered in any way investment advice.


Author Information

James has held analyst and consulting positions at SiliconANGLE/Wikibon, Forrester Research, Current Analysis and the Burton Group. He is an industry veteran, having held marketing and product management positions at IBM, Exostar, and LCC. He is a widely published business technology author, has published several books on enterprise technology, and contributes regularly to InformationWeek, InfoWorld, Datanami, Dataversity, and other publications.

