Proposal for Proper Use of Image Generative AI and Proper Application of the Copyright Law (Original version, English translation)

Background

Recently, artificial intelligence (AI) has evolved at an exponential pace. The development of diffusion models, which generate images from noise, and the emergence of Stable Diffusion, an image generative AI model based on them, have caused public use of image generative AI to grow explosively. On the other hand, guidelines and legal systems for the use of such new technologies remain immature, and various problems have occurred.

Specifically, these problems include (1) cases where someone other than the copyright holder trains an image generative AI on the copyrighted works of a specific illustrator without the copyright holder’s permission, and then publishes images that imitate the characteristics of the illustrator’s work, along with a trained AI model that can generate them¹⁾, and (2) cases where someone other than the copyright holder directly modifies a copyrighted work with image generative AI without permission and publishes it as if it were a different work²⁾. The data sets used for machine learning of Stable Diffusion and of image generative AI models derived from it include data from Danbooru, an image posting website known for the problem of works being re-posted without the copyright holder’s permission, as well as many unauthorized screenshots of copyrighted works such as animation. These data are used for machine learning without the permission of the copyright holders, which is considered a fundamental problem of image generative AI.

Ordinarily, when a copyrighted work is used, the rights of its author are protected by the Copyright Law. However, Article 30-4 of the revised Copyright Law, which came into force on January 1, 2019, permits the exploitation of copyrighted works for information analysis. This provision was established to allow copyrighted works to be used without permission for information analysis such as machine learning by AI, as long as the use does not adversely affect the market for copyrighted works. Accordingly, the article carries a proviso stating that “this does not apply if the action would unreasonably prejudice the interests of the copyright holder in light of the nature or purpose of the work or the circumstances of its exploitation”. In other words, considering the type and purpose of the copyrighted work used in machine learning and the manner of its use, a use that unreasonably harms the interests of the copyright holder falls outside the scope of the permitted exploitation. Whether machine learning for image generative AI falls under this proviso is disputed, and no conclusion has been reached. As a similar example, Echi pointed out the possibility that training AI on Disney movies for the purpose of creating a movie unreasonably harms the interests of copyright holders³⁾. However, before the discussion on the legitimacy of AI machine learning on copyrighted works has matured, a one-sided interpretation has spread, namely that the use of copyrighted works in machine learning of image generative AI is legally acceptable, and problem (1) above has been left unaddressed. Recently, some people have even made money by selling adult images that imitate the characteristics of a specific creator’s work using trained AI models.
In this situation, the original creators are not compensated for the unauthorized use of their copyrighted works, and even the skills they have trained are stolen and misused.

The problem described in (2) above is a clear copyright infringement from the viewpoint of reliance (dependence on the original work) and similarity, which are the elements of copyright infringement, but it becomes difficult to prove once the metadata of the AI-generated image and the information on the generation process have been deleted. Because a large number of images are generated by image generative AI every day, it is difficult for each copyright holder to resolve these problems individually, and in some cases copyright holders are unfairly attacked when they point out copyright infringements. As a result, there are many cases where the copyright holder has given up and not reported the issue.

These problems with image generative AI were anticipated before the 2019 revision of the Copyright Law. At the New Information Property Review Committee, held in the Prime Minister’s Office from 2016 to 2017, the emergence of AI that generates music, paintings, illustrations, etc. was reported, and the committee discussed potential future issues, including cases where AI products infringe the copyrights of others and cases where AI products are misused⁴⁾. However, since few specific problems had arisen at that time, the committee decided to continue discussing specific cases related to AI products while paying close attention to changes in the AI technology landscape, given the speed of those changes. The committee also discussed the development of an ecosystem to promote the use of AI and requested a regulatory framework of flexible copyright restrictions to promote AI machine learning. This request was discussed at the Council for Cultural Affairs and reflected in the revised Copyright Law, which was enacted on May 18, 2018, and came into force on January 1, 2019. In the discussion on the revision at the Council for Cultural Affairs, the scope of copyright restriction was assumed to cover “the first layer use” (types of use that do not usually harm the interests of the copyright holder) and “the second layer use” (types of use that have a minor impact on the interests of the copyright holder). The first layer use, under which AI machine learning was categorized, includes the replication of copyrighted material to generate caches that improve network functions and caches in a computer⁵⁾, ⁶⁾. Within this category, AI machine learning was expected to be used for purposes such as identifying objects, that is, analyzing the characteristics of image data in order to extract and utilize metadata associated with copyrighted works.
The council did not discuss the use of AI to generate image data as image generative AI currently does, that is, a use that generates data with characteristics similar to those of the copyrighted works used in machine learning, and one that the New Information Property Review Committee had identified as a future concern⁵⁾, ⁶⁾. As a result, the problems described in (1) and (2) above have been left unaddressed, without appropriate guidelines or legal systems, even though they were anticipated as of 2017.

When others create, publish, and use image generative AI trained specifically on a particular author’s works, this facilitates the circulation of fakes of the author’s work, disrupts the market, damages the author’s reputation, and harms copyrighted works in general. As mentioned above, Article 30-4 of the Copyright Law has a proviso that “permission of exploitation does not apply if the action would unreasonably prejudice the interests of the copyright holder in light of the nature or purpose of the work or the circumstances of its exploitation”. However, whether this proviso applies is ultimately to be decided by a court of law from the viewpoint of “whether it conflicts with the market for the use of the copyrighted work or interferes with the potential future market of the copyrighted work”⁷⁾. While a large number of image generative AI models trained on the works of specific authors are being created every day, it is not realistic for individual authors to spend large amounts of money and time resolving the problem through lawsuits each time. Given that image generative AI has been used in a manner far different from what was envisioned when the Copyright Law was revised, and that many problems have occurred as a result, it is appropriate to address the problem by establishing suitable legal provisions and applying them.

Image generative AI is a breakthrough technology and is expected to contribute in the long term to the creative and content industries, where creation is at the center of value. However, the current confusion, inherent in the dawn of a new technology, has a significant negative impact on these industries and requires an immediate response. We believe it is essential both to use image generative AI properly and to protect copyrighted works properly under the Copyright Law, and we therefore propose the following as a basic policy.

Our proposal

The permission to exploit copyrighted works stipulated in Article 30-4 of the Copyright Law shall not apply to the use of copyrighted works in machine learning of image generative AI such as Stable Diffusion.
- The use of copyrighted works in machine learning of image generative AI shall require the written consent of copyright holders on an opt-in basis, and permission to use a copyrighted work shall be obtained before the use.
When image generative AI is used, a license fee shall be paid to the copyright holders of the works used in AI machine learning, according to factors such as the number of times consumers use the AI.
Copyright protection shall continue to be granted, as before, to creatively produced expressions of thoughts or sentiments.
- In the case of image generative AI products, copyright protection shall not be granted to works that are wholly or mostly image generative AI products, but only to those in which creative contribution by a human being is clearly recognizable.

There is no doubt that AI contributes, and will continue to contribute, to the creative field. However, given the advance of breakthrough technologies such as image generative AI, careful attention must be paid to securing proper copyright protection and the return of profits to copyright holders, both for the future advancement of the creative world and industry and for the future coexistence of human beings and AI. To compensate authors for the use of image generative AI, it is essential to link each copyright holder with the copyrighted works used in AI machine learning. For this linkage, the opt-in method is appropriate because it guarantees that copyright holders participate of their own free will. By contrast, operating an opt-out method properly would require each copyright holder to check individually for unauthorized use of their images in the data sets for AI machine learning, which are said to contain more than five billion images; this is not realistic. In addition, in the current situation, where unauthorized, unpaid use of copyrighted works in AI machine learning is widespread, it is difficult to build clean image generative AI, which obtains permission for the copyrighted works used in machine learning and pays license fees to copyright holders, by relying on people’s goodwill alone. To bring about clean image generative AI that treats copyrighted works appropriately, and to ensure a return to copyright holders, image generative AI trained through unauthorized machine learning needs to be regulated by law. Only in an environment from which misused generative AI has been excluded can copyright holders create image generative AI trained on their own copyrighted works, utilize it, in some cases license it for commercial and other purposes, and receive appropriate compensation for the use of their copyrighted works in image generative AI.
Therefore, the use of copyrighted works in machine learning of image generative AI must not fall under the permitted exploitation of copyrighted works stipulated in Article 30-4 of the Copyright Law.

Image generative AI can produce a massive number of images in a short period of time and can theoretically encompass any combination of expressions that exist as electronic data. Granting copyright to these AI products, or to minor modifications of them, would create an unfair copyright monopoly for individuals or corporations that generate images in massive quantities using image generative AI, and would harm creative activities. Therefore, in the case of image generative AI products, it is essential to grant copyright only to those in which creative input by a human being is clearly recognizable, for example, those consisting mostly of work that does not use image generative AI. On the other hand, there are an increasing number of cases in which people who made no or only marginal creative input claim copyright protection by falsely asserting that they created the work without using image generative AI. This problem of fake content has already become apparent. In the future, many disputes over copyright infringement are expected, and authors will need to be prepared to prove the authenticity of their artwork, for example by recording their own creation process³⁾.

Considering the current situation, this proposal aims to contribute to the advancement of human art, the creative industry, and the economy. This proposal does not intend to overturn the permission to exploit copyrighted works for AI machine learning in uses that match the first layer use (types of use that do not usually harm the interests of the copyright holder), which is what was originally envisioned in the revision of the Copyright Law enacted on May 18, 2018. Rather, it intends to promote the proper use of image generative AI, which has been used in a manner far different from the first layer use, and to promote the protection of the rights of copyright holders. To implement this proposal at the working level, further discussion with various stakeholders is needed on the definition and scope of image generative AI and on the scope of AI products eligible for copyright. Fully implementing the proposal may require further advancement of technologies such as blockchain or non-fungible tokens (NFTs) and would take a considerable amount of time. On the other hand, it is reasonable to assume that this proposal, as a basic concept for the proper use of generative AI and for copyright protection, may apply not only to images but also to video, music, and other fields that may involve generative AI in the future.

The discussion on generative AI is still emerging and immature; we therefore hope that this proposal serves as a catalyst for future discussions in various places and among stakeholders. As the discussion develops, this proposal will be revised as appropriate.

Developed by a group of volunteers concerned about the future of creators and AI

Original version published February 18, 2023

References

1) Civitai, Available at https://civitai.com/tag/dreambooth. Accessed Feb 13, 2023.
2) Kiyoshi Shin, 2023, 「AIトレパク」が問題に, ASCII.jp, Available at https://ascii.jp/elem/000/004/121/4121719/. Accessed Feb 16, 2023.
3) Yasuyuki Echi, 2020, AI生成物・機械学習と著作権法, 日本弁理士会パテント2020 Vol.73 No.8（別冊No.23）.
4) 新たな情報財検討委員会, 2017, 新たな情報財検討委員会報告書 ―データ・人工知能（AI）の利活用促進による産業競争力強化の基盤となる知財システムの構築に向けて―（平成29年3月）, Available at https://www.kantei.go.jp/jp/singi/titeki2/tyousakai/kensho_hyoka_kikaku/2017/johozai/houkokusho.pdf. Accessed Feb 16, 2023.
5) 文化審議会著作権分科会, 2017, 文化審議会著作権分科会報告書（平成29年4月）, Available at https://www.bunka.go.jp/seisaku/bunkashingikai/chosakuken/pdf/h2904_shingi_hokokusho.pdf. Accessed Feb 16, 2023.
6) 文化審議会著作権分科会, 2018, 著作権法の一部を改正する法律　概要説明資料, Available at https://www.bunka.go.jp/seisaku/bunkashingikai/chosakuken/bunkakai/51/pdf/r1406118_08.pdf. Accessed Feb 16, 2023.
7) 文化庁著作権課, 2019, デジタル化・ネットワーク化の進展に対応した柔軟な権利制限規定に関する基本的な考え方（著作権法第30条の４，第47条の４及び第47条の５関係）（令和元年10月24日）, Available at https://www.bunka.go.jp/seisaku/chosakuken/hokaisei/h30_hokaisei/pdf/r1406693_17.pdf. Accessed Feb 16, 2023.