Australia’s modern slavery laws present an opportunity to stop artificial intelligence models from spreading misinformation, according to a new report that warns state actors and cyber criminals could “poison” AI by manipulating datasets.
The report from the Cyber Security Cooperative Research Centre (CSCRC), titled ‘Poison the Well: AI, Data Integrity, and Emerging Cyber Threats’, also urges the government to adopt better oversight of datasets used to train AI, similar to the European Union’s forthcoming AI Act.
In Australia, firms with annual revenue above $100 million are required under the Modern Slavery Act to produce annual reports scrutinising their supply chains for modern slavery risks, a mechanism the CSCRC believes could be leveraged to monitor AI development. The objective is to mitigate the risks of data poisoning and label poisoning.
“For example, captured entities that are using AI systems in their operations may be required to provide details of third-parties through which AI technologies are procured and produced. And for businesses developing their own AI systems, details of training data sets and their origin could be provided,” the report reads.
Data poisoning refers to the creation of new websites, or the editing of existing ones, to contaminate a dataset with malicious content when the internet is scraped for AI training data.
Label poisoning, meanwhile, refers to the coercion of workers, predominantly in developing countries, who are tasked with labelling large language model training data, into deliberately mislabelling harmful data so that it is included in the final AI model. This might include labelling abuse material as “something harmless, like a tree, cat or bag”.
Only 0.01 per cent of the data in a training set needs to be manipulated for “targeted mistakes to be introduced in model behaviour”, according to a study supported by Google and ETH Zurich and referenced in the CSCRC paper, which demonstrates how such a poisoning attack might occur.
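To make that figure concrete, the toy sketch below (illustrative only, and not drawn from the report or the study) flips the labels on 0.01 per cent of a hypothetical million-example corpus; every name in it is invented for illustration.

```python
import random

# Illustrative sketch only -- a toy labelled corpus standing in for the
# human-labelled training data described in the report. All names here
# (poison_labels, harmless_label, the corpus itself) are hypothetical.
POISON_RATE = 0.0001  # 0.01 per cent, the fraction cited in the study


def poison_labels(dataset, harmless_label="tree"):
    """Relabel a tiny, randomly chosen slice of the dataset as something
    harmless, mimicking the label-poisoning scenario the report describes."""
    n_poison = max(1, int(len(dataset) * POISON_RATE))
    for example in random.sample(dataset, n_poison):
        example["label"] = harmless_label  # harmful content now mislabelled
    return dataset


# A million-example toy corpus: at 0.01 per cent, only 100 labels change.
corpus = [{"content": f"item-{i}", "label": "harmful" if i % 2 else "benign"}
          for i in range(1_000_000)]
poisoned = poison_labels(corpus)
print(sum(1 for ex in poisoned if ex["label"] == "tree"))  # -> 100
```

In a real attack the mislabelled items would be harmful material rather than placeholder strings, which is why the report focuses on oversight of who labels training data and under what conditions.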
However, data and label poisoning are emerging threats, and there is so far no evidence of such attacks having occurred.
When asked whether the details of a training dataset revealed through the Modern Slavery Act would be sufficient to meet the CSCRC’s transparency objectives, CSCRC chief executive Rachael Falk said “it is up to government to work with industry to co-design guidance related to AI and the Modern Slavery Act”.
The report notes that the AI supply chain is currently opaque because “most of the companies developing LLMs do not release detailed information about the data sets that have been used to build their models”.
“As recently highlighted by the UK’s National Cyber Security Centre (NCSC), AI models are only as good as the data they are trained on. And this is where things get blurry, because there is often a lack of transparency as to where training data comes from and, therefore, its integrity is a key issue for consideration,” the report reads.
The report recommends adopting a broad provision from the EU AI Act as a “pragmatic first step in achieving regulatory oversight”.
“Article 10 of the EU’s proposed AI Act provides a clear guide in relation to data set oversight that is being considered. In this context, AI data set transparency would be required in relation to ‘high risk’ AI models and may offer a good starting point for Australian regulators to work from,” Ms Falk told InnovationAus.com.
High-risk AI systems are those that “negatively affect safety or fundamental rights”, are used in products covered by the EU’s product safety legislation, or fall within one of eight specifically named areas, including biometric identification, and education and vocational training.
Under the EU Act, data sets used to train high-risk AI models will broadly need to be subject to “appropriate data governance and management practices”, and to “be relevant, representative, free of errors and complete, and take into account, to the extent required by the intended purpose, the characteristics or elements that are particular to the specific geographical, behavioural or functional setting”.
Industry and Science Minister Ed Husic departed for the United Kingdom on Monday, where he will attend an AI Safety Summit hosted by UK Prime Minister Rishi Sunak.
Overnight, the G7 reportedly signed an agreement to establish a voluntary code of conduct for companies developing AI, which the CSCRC would back if implemented in Australia.
Mr Husic joins the talks on the global stage as the Department of Industry, Science and Resources considers regulatory reforms to support responsible AI in Australia.