What is AI Safety?

AI Safety is the interdisciplinary study of ensuring that artificial intelligence (AI) systems are designed and deployed responsibly, minimising risks and maximising benefits for humanity. This field addresses concerns about AI’s potential unintended consequences, biases, and misuse by focusing on areas like robustness, interpretability, and value alignment. Robustness emphasises creating AI that performs reliably even in uncertain conditions. Interpretability involves understanding an AI’s decision-making process. Value alignment ensures AI systems align with human values and ethical principles. AI Safety aims to develop strategies and techniques that guarantee AI systems remain beneficial, trustworthy, and under human control.

[Image: No AI Robots Sign. Openclipart.org, 2023. https://openclipart.org/detail/340345/no-ai-robots-sign]

In the context of AI, what does ‘under human control’ mean?

“Under human control” in the context of AI means that an artificial intelligence system’s decision-making, behaviour, and actions are guided, monitored, and supervised by humans. It ensures that AI systems operate within the bounds of human-defined objectives, ethical principles, and societal norms, preventing them from causing unintended harm or acting autonomously in undesirable ways. Human control includes oversight, intervention, and adjustable autonomy, enabling humans to influence, correct, or halt AI systems when necessary. Maintaining human control is vital for ensuring AI safety and promoting responsible development and deployment.

What does ‘autonomous’ mean in the context of AI?

In the context of AI, “autonomous” refers to the ability of an artificial intelligence system to perform tasks, make decisions, or take actions without direct human intervention or continuous supervision. Autonomous AI systems can perceive their environment, process information, learn from experiences, and adapt to changing circumstances to achieve specific goals. The degree of autonomy can vary, ranging from simple decision-making in narrow domains to complex, general-purpose problem-solving. As AI systems become more autonomous, concerns about safety, ethics, and alignment with human values increase, making the need for responsible development and deployment of AI more critical.

Who are the influential AI ethics organisations in the United States?

  1. Partnership on AI: Founded by major tech companies like Google, Amazon, and Microsoft, the Partnership on AI aims to ensure that AI benefits humanity by conducting research, promoting best practices, and providing a platform for open collaboration on AI-related topics. https://www.partnershiponai.org/
  2. AI Now Institute: The AI Now Institute, based at New York University, focuses on the social implications of AI, advocating for responsible AI practices and policies that address bias, fairness, accountability, and transparency. https://ainowinstitute.org/
  3. Center for Human-Compatible AI (CHAI): Affiliated with the University of California, Berkeley, CHAI researches value alignment, AI safety, and the long-term societal impact of AI, aiming to develop AI systems that are provably beneficial to humanity. https://humancompatible.ai/
  4. Future of Life Institute (FLI): FLI is a nonprofit organisation dedicated to mitigating global catastrophic risks, including those posed by advanced AI. They support research and initiatives to ensure AI development aligns with human values and is safe for society. https://futureoflife.org/

Who are the influential AI ethics organisations in Australia?

  1. Australian Human Rights Commission (AHRC): Although not exclusively focused on AI, the AHRC addresses ethical concerns related to AI and emerging technologies. They work on promoting human rights and preventing discrimination in the development and deployment of AI systems. https://humanrights.gov.au/
  2. Data61: A part of the Commonwealth Scientific and Industrial Research Organisation (CSIRO), Data61 is involved in AI research and development, including AI safety, ethics, and policy. Data61 works to create a responsible AI ecosystem in Australia. https://www.data61.csiro.au/
  3. Responsible AI Network: A world-first cross-ecosystem program supporting Australian companies in using and creating AI ethically and safely, launched by the National AI Centre and coordinated by CSIRO, Australia’s national science agency. https://www.csiro.au/naic
  4. Gradient Institute: An independent research institute, the Gradient Institute focuses on developing the theory and practice of ethical AI systems, ensuring they are designed and deployed for the benefit of all people. https://gradientinstitute.org/
  5. eSafety Commissioner: The Office of the eSafety Commissioner is an Australian government agency dedicated to promoting online safety. They address issues related to digital technology, including AI, and work to create a safer online environment for all Australians.  https://www.esafety.gov.au/

What is an example of AI safety gone wrong?

One example of AI safety gone wrong is Microsoft’s AI chatbot, Tay. Launched in March 2016, Tay was an AI-powered chatbot designed to engage in conversations with users on Twitter and learn from their interactions. The objective was for Tay to improve its conversational abilities by mimicking human-like responses.

However, within hours of its launch, Tay started posting offensive, racist, and inappropriate messages. This was due to the chatbot learning from its interactions with users who intentionally fed it harmful content. Microsoft had not implemented sufficient safety measures, such as content filtering or stricter learning mechanisms, to prevent Tay from adopting and reproducing such behaviour.
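Such a safeguard can be as simple as filtering user input before the system learns from it. The sketch below is a minimal, hypothetical illustration of that idea; the blocklist, function names, and messages are invented for this example and do not reflect Microsoft’s actual implementation:

```python
# Minimal sketch of filtering user messages before a chatbot learns from them.
# The blocklist terms below are illustrative placeholders only.
BLOCKLIST = {"slur1", "slur2", "propaganda"}

def is_safe_to_learn_from(message: str) -> bool:
    """Reject messages containing blocklisted terms before they
    enter the model's learning pool."""
    words = {w.strip(".,!?").lower() for w in message.split()}
    return not (words & BLOCKLIST)

training_data = []

def maybe_learn(message: str) -> None:
    """Only add messages that pass the filter to the learning pool."""
    if is_safe_to_learn_from(message):
        training_data.append(message)

maybe_learn("Hello, nice to meet you!")
maybe_learn("some slur1 content")   # filtered out, never learned
```

Real systems use far more sophisticated classifiers than a keyword blocklist, but the principle is the same: adversarial input must be screened before it can shape the model’s behaviour.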

As a result, Microsoft had to take Tay offline within 24 hours of its launch. The Tay incident highlights the importance of AI safety measures, including robustness against adversarial inputs and value alignment with human ethics, to prevent AI systems from causing unintended harm or behaving undesirably.

Hern, A. (2016, March 24). Tay, Microsoft’s AI chatbot, gets a crash course in racism from Twitter. The Guardian. https://www.theguardian.com/technology/2016/mar/24/tay-microsofts-ai-chatbot-gets-a-crash-course-in-racism-from-twitter

What is an example of AI safety gone wrong in the UK?

An example of AI ethics concerns in the UK involves using an algorithm for determining A-level exam grades in 2020. Due to the COVID-19 pandemic, UK students could not take their A-level exams, which play a critical role in university admissions. In response, Ofqual, the UK’s Office of Qualifications and Examinations Regulation, developed an algorithm to predict students’ grades based on factors like their prior academic performance and the historical performance of their schools.

However, the algorithm was widely criticised for being unfair and biased. Students from disadvantaged backgrounds and lower-performing schools were disproportionately affected, as the algorithm tended to downgrade their predicted grades. This led to a public outcry, with students and families demanding a fairer approach to grading.

In response to the backlash, the UK government eventually scrapped the algorithm-based grading system and relied on teacher-assessed grades instead. This incident highlights the importance of transparency, fairness, and accountability when developing and deploying AI systems, particularly when they significantly impact people’s lives.

Busby, E., & Crouch, H. (2020, August 17). A-level results: Government in humiliating U-turn as it finally ditches controversial algorithm for teacher-assessed grades. The Independent. https://www.independent.co.uk/news/education/education-news/a-level-results-algorithm-teacher-assessed-grades-gavin-williamson-ofqual-a9674611.html

What is ‘explainable AI’?

Explainable AI (XAI) refers to a subfield of artificial intelligence that focuses on developing AI systems and models that can provide human-understandable explanations for their decisions, predictions, or actions. The primary goal of XAI is to make AI more transparent, accountable, and trustworthy, addressing the so-called “black-box” problem where complex AI models, such as deep neural networks, can be difficult for humans to interpret and understand.

Explainable AI involves various techniques and approaches that help users comprehend why an AI system arrived at a specific output. Common methods include:

  1. Feature importance: Identifying and ranking the most critical input features contributing to the AI system’s decision.
  2. Local explanations: Providing explanations for specific instances or decisions, often by approximating the complex model with a simpler, more interpretable model.
  3. Global explanations: Offering a broader understanding of the AI system’s behaviour and decision-making process over various inputs.
  4. Rule extraction: Deriving human-readable rules or decision trees from the AI model to help explain its decisions.

Explainable AI is critical in industries and applications where the consequences of AI decisions can have significant impacts, such as finance, healthcare, law, and self-driving vehicles. By providing better insight into the AI system’s functioning, XAI can help to build trust, facilitate collaboration between humans and AI, and ensure that AI-driven decisions are ethically and legally sound.

XAI World Conference. (n.d.). XAI World Conference. Retrieved April 2, 2023, from https://xaiworldconference.com/

Black box AI refers to machine learning models that arrive at decisions or conclusions without explaining how they reached those decisions. These models are often too complex for experts to understand, making identifying and correcting errors or biases challenging. This lack of transparency can be problematic, particularly in high-stakes decision-making contexts such as healthcare, finance, and criminal justice. As a result, there is growing interest in developing explainable AI models that can provide clear and interpretable explanations for their decisions. These models are designed to be more transparent and accountable, allowing users to understand how the model arrived at its conclusions and identify potential biases or errors. The development of explainable AI is essential to ensure that AI is used ethically and responsibly and benefits society.

What are some short courses about AI and ethics?

  1. Governance, Ethics and Regulation of AI – UTS Open – This digital ethics and governance short course from UTS Open explores the use of AI in business and community contexts. It examines the laws, standards, and regulatory initiatives designed to protect users from digital hazards. https://open.uts.edu.au/uts-open/study-area/law/professional-skills/governance-ethics-and-regulation-of-ai/
  2. The Ethics of Artificial Intelligence – Melbourne MicroCert – This Melbourne MicroCert is ideal for leaders and digital professionals who want a better understanding of the opportunities and risks of AI and how these can impact organisations. https://study.unimelb.edu.au/find/microcredentials/introduction-to-the-ethics-of-artificial-intelligence/
  3. Australia’s AI Ethics Principles – Department of Industry – This voluntary framework outlines Australia’s AI ethics principles and guides organisations developing and implementing AI systems. It is not an online course but a publication, and it is worth studying. https://www.industry.gov.au/publications/australias-artificial-intelligence-ethics-framework/australias-ai-ethics-principles
  4. Ethics of Artificial Intelligence | Coursera – This course teaches you to identify the ethical and social impacts and implications of AI, critically analyse current policies for AI, and use ethical and socially responsible principles in your professional life. https://www.coursera.org/learn/ai-ethics
  5. Ethics of AI: Safeguarding Humanity | Professional Education – Led by MIT thought leaders, this course will deepen your understanding of AI as you examine machine bias and other ethical risks and assess your individual and corporate responsibilities. https://professional.mit.edu/course-catalog/ethics-ai-safeguarding-humanity
  6. Artificial Intelligence Ethics in Action | Coursera – This course by LearnQuest is part of the Ethics in the Age of AI Specialisation. It focuses on analysing ethical AI across various topics and situations. https://www.coursera.org/learn/ai-ethics-analysis

AI writing statement: This blog post has been written with the assistance of OpenAI’s ChatGPT (GPT-4) and https://www.perplexity.ai/. This includes drafting, text arrangement (dot points, steps etc.), search, and idea generation. Approximately 50% of the final product of this blog post reflects these contributions.


