An ISBN is not only a unique identifier for a publication; it also carries corresponding metadata, primarily the book title and authors, as well as information about the language, publisher, and place and time of publication. With this identifier and its metadata, one can build a global sales and management system, and it can even be used to establish a system through which authors and illustrators are remunerated in proportion to the number of times their publications are borrowed from public libraries.
So, what is the purpose of a registration database for digital content? First, let us make some assumptions:
- A massive registration database for digital content already exists in the world.
- Most creators can register their digital content in this database easily and at low cost, at the moment they complete the creation.
- The unique identifier of the digital content is displayed prominently on the distribution medium.
Given these assumptions, several application scenarios arise immediately:
- Inspecting the copyright terms of digital content.
- Dividing the profits among all stakeholders of the digital content.
- Providing Proof of Existence for digital content.
Copyright terms of digital content
On the Internet today, the vast majority of content users pay little attention to the copyright of digital content, because the copyright information is too hard for them to find. After countless failures, they simply give up.
Let me show you how tricky it is to find copyright information for the meme “Philosoraptor”.
It took 10 seconds to find tons of copies of the meme image. It took another 5 minutes of reading knowyourmeme.com to learn that the original illustration of the Philosoraptor was released and copyrighted by creator Sam Smith as a T-shirt design for sale on the online retailer Lonely Dinosaur. After Smith realized that it had become a famous meme, he emailed Lonely Dinosaur:
We’re not exactly sure who started putting text over it, and far be it from us to try and control a meme… We put a creative commons non-commercial license on it, so all your stuff is cool and we think what you’re doing is great, but now everyone thinks that the shirt we’re selling is just something we cut and pasted off the web, which kind of sucks for us. It’d be cool to see if we could put our side of the story out there and see if we can find the person who first put text over it, and have a complete history of origin of the meme.
It seems that the creator changed the copyright to CC BY-NC; however, I spent 30 more minutes trying to verify the authenticity of the above message and failed, and I could not even find Sam Smith’s contact information. I also could not tell which version of CC BY-NC applies, and I finally gave up. If a person who wants to respect copyright still cannot be sure about the copyright of a piece of digital content, what can we expect of regular internet users?
When there is a way to check copyright quickly, people who do not care about copyright may not start caring, but from the point of view of an enterprise it is a significant improvement. In the Directive on Copyright in the Digital Single Market, approved by the European Parliament on March 26, 2019, the most controversial provision, draft Article 13, requires each website to:
- Get permission to use copyrighted work.
- Otherwise, prevent infringing works from appearing.
All major UGC platforms, such as YouTube and Facebook, will inevitably use AI to identify infringing works, even though it may flag innocent creators, in order to avoid paying considerable fees to various types of copyright holders for permission. Filtering the content on those platforms in this way will necessarily hinder creative freedom. Moreover, small-scale UGC platforms cannot afford to research and develop AI filters; their operators can only choose to wind up their business in the EU. It is a lose-lose-lose situation for the creator, the audience, and the platform, which is why this article is controversial. The fundamental cause of this outcome is that the copyright terms of creative works are not easy to inspect. If they were, first of all, no innocent creator would be caught. Moreover, even when platforms handle copyrighted works, they could immediately check whether the uploader has obtained the rights. Everyone would be happy: copyright is respected, freedom of creation is protected, and platforms do not have to spend money developing AI filters needlessly.
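The "immediate check" mentioned above could look roughly like the following Python sketch. It is only an illustration under assumptions that are not part of the original text: the registry is a plain in-memory dictionary keyed by a SHA-256 fingerprint, and the field names `holder`, `licensed_to`, and `public_license`, as well as the function `check_upload`, are hypothetical.

```python
import hashlib
from typing import Dict

def fingerprint(content: bytes) -> str:
    """Return a SHA-256 hex digest used as the lookup key for the content."""
    return hashlib.sha256(content).hexdigest()

def check_upload(registry: Dict[str, dict], content: bytes, uploader: str) -> str:
    """Classify an upload by looking up its fingerprint in a (hypothetical) registry."""
    record = registry.get(fingerprint(content))
    if record is None:
        # Nothing is registered for these exact bytes.
        return "unregistered"
    if uploader == record["holder"] or uploader in record["licensed_to"]:
        # The uploader is the rights holder or an explicit licensee.
        return "permitted"
    if record["public_license"] is not None:
        # The work is under a public license; the platform must still honor its terms.
        return "permitted-under:" + record["public_license"]
    return "needs-permission"

# Hypothetical registry with a single registered work.
registry = {
    fingerprint(b"raw bytes of a photo"): {
        "holder": "alice",
        "licensed_to": {"bob"},
        "public_license": None,
    }
}
```

An exact hash only matches byte-identical copies, so a modified copy (for example, a meme with text added) would come back as "unregistered". A real system would need some way to link derived works to the original registration.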
To talk about profit sharing, we must first clarify the relationships among all stakeholders in a piece of digital content. Depending on the nature of the content, these can be straightforward or extremely complicated. For example, if you write a short story and put it on an individual content platform, then you are the author and the platform is the publisher. In principle, the sharing ratio is a deal between these two stakeholders only. But if you post your story on one of today's big social platforms, the platform takes all of your traffic and all the profits from advertising, and no one can give you a cent: even someone who thinks you have made a brilliant work can only leave a comment or a “like”. You get nothing for your humble creation and remain as poor as a church mouse. However, if the metadata includes the stakeholder information and the sharing ratio, then even if the platform takes the revenue from the traffic the author generated, the stakeholders can still rely on a sharing mechanism operated by a third party based on the sharing ratio in the metadata.
A more complicated scenario: a photographer takes a picture, an illustrator adds an illustration to the photo, and you write an article that uses the processed photo as its cover image and publish it on a platform. The stakeholders are the photographer, the illustrator, the author, and the publishing platform. When the data of all these stakeholders is recorded in the metadata together with the sharing ratio, a mechanism can calculate any profit generated and distribute it directly to the relevant stakeholders. Once we have this metadata, even a very complicated stakeholder relationship provides enough information to design a corresponding sharing mechanism.
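The distribution step described above can be sketched in a few lines of Python. This is only an illustration under assumptions that are not in the original text: the sharing ratio is stored as non-negative integer weights per stakeholder, revenue is counted in integer cents, and leftover cents from rounding go to the stakeholders with the largest remainders. The function name `split_revenue` and the example weights are hypothetical.

```python
from typing import Dict

def split_revenue(amount_cents: int, sharing: Dict[str, int]) -> Dict[str, int]:
    """Split amount_cents among stakeholders in proportion to integer weights."""
    if amount_cents < 0 or any(w < 0 for w in sharing.values()):
        raise ValueError("amount and weights must be non-negative")
    total_weight = sum(sharing.values())
    if total_weight == 0:
        raise ValueError("at least one stakeholder needs a positive weight")

    payouts = {}
    remainders = []
    for name, weight in sharing.items():
        # Integer division keeps every payout in whole cents.
        quota, remainder = divmod(amount_cents * weight, total_weight)
        payouts[name] = quota
        remainders.append((remainder, name))

    # Hand out the cents lost to rounding, largest remainder first.
    # sorted() is stable, so ties keep the order of the metadata.
    leftover = amount_cents - sum(payouts.values())
    for _, name in sorted(remainders, key=lambda item: -item[0])[:leftover]:
        payouts[name] += 1
    return payouts
```

For the four-stakeholder scenario, `split_revenue(1000, {"photographer": 2, "illustrator": 1, "author": 5, "platform": 2})` pays out 200, 100, 500, and 200 cents respectively. Working in integer cents avoids floating-point drift, and the payouts always add up to exactly the amount received.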
Proof of Existence
The original digital Proof of Existence used the Bitcoin blockchain: a user uploaded a digital file, and the blockchain recorded a tamper-proof timestamp in the block. Furthermore, the nature of the blockchain requires the uploader to sign and confirm the upload with their digital key. In this case, anyone who can demonstrate that they own that digital key indirectly proves that they are the original uploader. Therefore, the following conditions establish a Proof of Existence for registered digital content:
- Registration requires signing a confirmation with a digital key.
- The registered record contains the content itself, or simply a digital fingerprint representing the content.
- Timestamps are recorded by a tamper-proof mechanism.
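The three conditions above can be illustrated with a minimal Python sketch. It makes several assumptions that are not in the original text: the fingerprint is a SHA-256 digest; the "signature" is an HMAC tag, used here only as a stand-in because Python's standard library has no public-key signatures (a real system would use an asymmetric signature so that anyone can verify it without the secret); and the timestamp is supplied by the caller, whereas on a blockchain it would come from the block. The names `register` and `verify` are hypothetical.

```python
import hashlib
import hmac
import json
import time
from typing import Optional

def fingerprint(content: bytes) -> str:
    """Digital fingerprint of the content (SHA-256 hex digest)."""
    return hashlib.sha256(content).hexdigest()

def _payload(fp: str, timestamp: int) -> bytes:
    # Canonical byte encoding of the fields covered by the tag.
    return json.dumps({"fingerprint": fp, "timestamp": timestamp}, sort_keys=True).encode()

def register(content: bytes, secret_key: bytes, timestamp: Optional[int] = None) -> dict:
    """Build a registration record: fingerprint, timestamp, and a keyed tag."""
    ts = int(time.time()) if timestamp is None else timestamp
    fp = fingerprint(content)
    # HMAC stand-in for a digital signature (assumption, see lead-in).
    tag = hmac.new(secret_key, _payload(fp, ts), hashlib.sha256).hexdigest()
    return {"fingerprint": fp, "timestamp": ts, "tag": tag}

def verify(record: dict, content: bytes, secret_key: bytes) -> bool:
    """Check that the record matches the content and was tagged with this key."""
    expected = hmac.new(
        secret_key, _payload(record["fingerprint"], record["timestamp"]), hashlib.sha256
    ).hexdigest()
    return (
        hmac.compare_digest(expected, record["tag"])
        and record["fingerprint"] == fingerprint(content)
    )
```

Because only the fingerprint is stored, the content itself never has to leave the creator's machine, yet any later change to the content, the timestamp, or the key makes `verify` return `False`.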
This Proof of Existence can only prove who uploaded what and when, so if a piece of digital content has already been circulating in public, no one can confirm whether the uploader is really the author. Hence the best practice is to register a creation as soon as it is completed and before it is released. But this requires a handy and very low-cost registration mechanism; otherwise, who would spend significant time or money to register content that is not valuable enough?
If a registration database for digital content can achieve all of the above, we return to the original question: “Why doesn’t digital content have a registration database like ISBN?” Establishing such a database requires solving some technical problems, and the two most complicated issues are as follows:
- How to make the entire registration database credible.
- How to clearly define the specification of the registered data structure.
A credible database
If the registration database covers copyright information, profit sharing, and even Proof of Existence, it is closely tied to the interests of the stakeholders of digital content. The system must therefore be credible from registration through storage to query: the recorded information must demonstrably be untampered, and it must have been proposed or changed only by the relevant stakeholders. To build such a system, we cannot merely build a classic backend with a database, because no one could verify the completeness and correctness of the metadata.
Blockchain technology is feasible and satisfies the above conditions. First, every registration transaction must be signed with the uploader's digital key, and every block records a timestamp. Therefore, we know who registered which digital content and when. Second, the blockchain records not only the registration transaction but also every change transaction, which means anyone can trace back all modifications to a given piece of metadata. Finally, all blockchain block data is tamper-proof; this is more technical to explain and will not be detailed here.
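As a rough intuition for the last two properties, the following Python sketch builds a tiny hash chain. It is a simplification under stated assumptions, not how any particular blockchain is implemented: there is no network, consensus, or signature check, and the block fields and the function names `append_block`, `is_valid`, and `history` are hypothetical. It only shows why altering an old record is detectable and how the full change history of one item can be read back.

```python
import hashlib
import json
from typing import List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block

def block_hash(block: dict) -> str:
    """Hash every field of the block except the stored hash itself."""
    body = {k: block[k] for k in ("index", "timestamp", "transaction", "prev_hash")}
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()

def append_block(chain: List[dict], transaction: dict, timestamp: int) -> dict:
    """Append a transaction, linking it to the hash of the previous block."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    block = {
        "index": len(chain),
        "timestamp": timestamp,
        "transaction": transaction,
        "prev_hash": prev_hash,
    }
    block["hash"] = block_hash(block)
    chain.append(block)
    return block

def is_valid(chain: List[dict]) -> bool:
    """Recompute every hash and link; any edit to an earlier block breaks the chain."""
    prev_hash = GENESIS_HASH
    for index, block in enumerate(chain):
        if (
            block["index"] != index
            or block["prev_hash"] != prev_hash
            or block["hash"] != block_hash(block)
        ):
            return False
        prev_hash = block["hash"]
    return True

def history(chain: List[dict], content_id: str) -> List[dict]:
    """All registration and change transactions for one piece of content, in order."""
    return [b["transaction"] for b in chain if b["transaction"]["content_id"] == content_id]
```

If someone rewrites the terms stored in an early block, its recomputed hash no longer matches the stored one, so `is_valid` returns `False`. A real blockchain also relies on many independent nodes holding copies of the chain, which this sketch does not model.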
Data structure specifications for digital content metadata
To support these various uses, a registration of digital content needs to record many different types of metadata. Most of the time, this metadata is consumed by computational processes rather than downloaded by a human for review. Thus, a strict and precise specification is necessary. We are therefore proposing a specification designed specifically for digital content registration, called the International Standard Content Number (ISCN for short).
The purpose of ISCN is to regulate what information a digital content registration records and in what format. The current design divides the specification into four parts across two layers. The first layer contains the kernel, which records the unique identifier, while the second layer covers the holders of the digital content, its copyright, and the metadata of the digital content itself. The general structure of the metadata is as follows:
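As a rough illustration of the two-layer layout just described, and not the structure defined by the ISCN specification itself, the four parts might be modeled as Python dataclasses. Every class and field name below (`Kernel`, `Stakeholder`, `Rights`, `ContentMetadata`, `Registration`, `iscn_id`, and so on) is a hypothetical placeholder chosen for this sketch.

```python
import json
from dataclasses import asdict, dataclass
from typing import List

@dataclass
class Kernel:
    """First layer: records the unique identifier (field name is hypothetical)."""
    iscn_id: str

@dataclass
class Stakeholder:
    """Second layer: a holder of the content and their sharing weight."""
    name: str
    role: str
    sharing: int

@dataclass
class Rights:
    """Second layer: copyright terms attached to the content."""
    holder: str
    terms: str

@dataclass
class ContentMetadata:
    """Second layer: metadata of the digital content itself."""
    title: str
    fingerprint: str
    content_type: str

@dataclass
class Registration:
    kernel: Kernel
    stakeholders: List[Stakeholder]
    rights: List[Rights]
    content: ContentMetadata

    def to_json(self) -> str:
        # Sorted keys give a stable, machine-readable serialization.
        return json.dumps(asdict(self), sort_keys=True, indent=2)
```

A fixed, typed structure like this is what lets software, rather than a human reader, consume the metadata: a revenue-sharing service would read `stakeholders`, while a copyright checker would read `rights`.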
The first proposal of the specification is on GitHub; please feel free to leave a comment and help build a digital content registration specification together.