IPFS on its own does not solve the file versioning problem. IPNS and DNS links address part of it but are far from a perfect solution, as they do not support version traceability. ISCN, a metadata registry, fits the purpose.
The IPFS hash of a file serves as its content fingerprint: a unique signature that distinguishes files on IPFS. The hash changes completely between versions of a file even if only a single pixel has changed, which makes it a convenient way to check content authenticity.
For instance, the two versions of a text file below differ by a single space character, yet their content fingerprints are completely different.
Original version: QmQPzjhk9Eqy7vJWhaheyg4NiBxAz51WXUTsx92STBTs1j
Updated version (with a space character added): QmTQ6xEFKFSvRTxncf46CvYQf1rnRnfduKM7VepKyGo3LR
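The same effect can be reproduced with any cryptographic hash. The sketch below uses plain SHA-256 from Python's standard library as a stand-in, which is an assumption made for illustration: real IPFS fingerprints are derived differently (the file is chunked and the result is encoded as a CID), so the digests printed here will not match the Qm... values above. The sample text is also made up.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return the hex SHA-256 digest of data (a stand-in for an IPFS hash)."""
    return hashlib.sha256(data).hexdigest()


original = b"Hello IPFS"
updated = b"Hello IPFS "  # same text with one space character added

fp_original = fingerprint(original)
fp_updated = fingerprint(updated)

# Count how many hex characters differ position by position.
differing = sum(a != b for a, b in zip(fp_original, fp_updated))

print(fp_original)
print(fp_updated)
print(f"{differing} of {len(fp_original)} hex characters differ")
```

Hashing the same bytes again always gives the same digest, which is what makes the fingerprint usable for authenticity checks.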
This feature leads to a problem, however. Suppose someone backs up an important video clip to IPFS and, some days later, replaces it with a higher-quality version. The two versions are distinct files with different content fingerprints on IPFS, and because a file's web address is derived from its content fingerprint, the URL changes with every version. This hampers distribution of the file, since users cannot keep track of the most recent link.
There is a basic solution to this issue: IPNS (InterPlanetary Name System).
IPNS is not a complete solution for versioning
An IPNS name is a fixed identifier, the hash of a public key, that points to a content fingerprint which can change. The file owner can repoint the IPNS name to an updated version, so that visitors always retrieve the content fingerprint of the latest version.
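The pointer behaviour can be modelled in a few lines. This is a toy in-memory sketch, not the IPNS protocol or any IPFS client API: the class `NameRegistry`, its methods, and the name `example-key-hash` are all hypothetical, and signing with the owner's key is left out.

```python
class NameRegistry:
    """Toy model of a naming service: one mutable pointer per fixed name."""

    def __init__(self):
        self._pointers = {}  # fixed name -> current content fingerprint

    def publish(self, name, fingerprint):
        # Overwrites the previous target; nothing about it is kept.
        self._pointers[name] = fingerprint

    def resolve(self, name):
        return self._pointers[name]


registry = NameRegistry()
name = "example-key-hash"  # hypothetical placeholder for a public key hash

registry.publish(name, "QmQPzjhk9Eqy7vJWhaheyg4NiBxAz51WXUTsx92STBTs1j")
registry.publish(name, "QmTQ6xEFKFSvRTxncf46CvYQf1rnRnfduKM7VepKyGo3LR")

latest = registry.resolve(name)
print(latest)
```

In this model the name only ever resolves to the latest fingerprint; the earlier one is no longer reachable through it.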
The IPNS public key looks similar to an IPFS hash, which is also a non-human-readable hash string, e.g.:
Users can retrieve the content via the IPNS protocol like this:
or via the IPFS public gateway, such as:
IPNS lets users retrieve the latest version of the content through a single address, but it does not fully solve the file version management problem, because other important features are still missing.
Suppose an important constitutional document, such as the Basic Law of the HKSAR, is published on IPFS. Although the latest version of the document can be retrieved via its IPNS address, readers have no way to check what has changed since the older versions. This is a serious problem: citizens cannot refer back to what the government committed to in the past, cannot see the context of the constitutional law, and do not even have a detailed change log.
Complete version management involves much more than a simple naming service, and IPNS was not designed primarily for version management anyway.
We can look to GitHub for a more complete file-versioning system. GitHub provides sophisticated versioning features. They are overkill for most everyday use cases, but they show what a file-versioning system should include:
- a unique identifier for each version;
- change log details such as timestamp, diffs, and change remarks;
- traceability of each version, with the ability to revert to it.
A version database for IPFS files
Building a version database for IPFS is fairly straightforward: it is a data structure keyed by a unique version identifier, with the change log stored as metadata for each version. All records of the same content share a common content ID, and each record carries the content fingerprint of that particular version. The common content ID is important for searching, and plays a role similar to an IPNS name.
In fact, this kind of versioning data structure is already widely used in everyday scenarios, such as ISBN in publishing. However, there are two common problems: proprietary systems each maintain their own versioning scheme, which makes information exchange difficult; and the metadata is controlled by a few organisations or platforms, which carries a high risk of data tampering. This is where blockchain comes into play as a file-versioning database.
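A minimal sketch of such a structure follows, kept in memory for illustration. The names `VersionRecord` and `VersionDB`, the version IDs, the content ID `example-text`, and the timestamps are all assumptions; only the two fingerprints come from the example earlier in this article.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class VersionRecord:
    version_id: str   # unique identifier of this version (the key)
    content_id: str   # shared by every version of the same content
    fingerprint: str  # content fingerprint of this particular version
    timestamp: str    # change log metadata, ISO 8601 in UTC
    remark: str       # change log metadata, free-text note


class VersionDB:
    def __init__(self) -> None:
        self._records: Dict[str, VersionRecord] = {}

    def add(self, record: VersionRecord) -> None:
        if record.version_id in self._records:
            raise ValueError(f"version {record.version_id!r} already exists")
        self._records[record.version_id] = record

    def get(self, version_id: str) -> VersionRecord:
        return self._records[version_id]

    def history(self, content_id: str) -> List[VersionRecord]:
        """All versions sharing one content ID, oldest first."""
        matches = [r for r in self._records.values() if r.content_id == content_id]
        # Same-format ISO 8601 strings sort chronologically.
        return sorted(matches, key=lambda r: r.timestamp)


db = VersionDB()
db.add(VersionRecord("v2", "example-text",
                     "QmTQ6xEFKFSvRTxncf46CvYQf1rnRnfduKM7VepKyGo3LR",
                     "2021-02-01T00:00:00Z", "Added a space character"))
db.add(VersionRecord("v1", "example-text",
                     "QmQPzjhk9Eqy7vJWhaheyg4NiBxAz51WXUTsx92STBTs1j",
                     "2021-01-01T00:00:00Z", "Original version"))

history_ids = [r.version_id for r in db.history("example-text")]
print(history_ids)
```

Looking up by version ID returns one exact version, while looking up by content ID returns the whole history, which is the part a plain naming service does not offer.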
ISCN: version management for decentralized publishing
ISCN is an infrastructure for registering content metadata, including but not limited to file version information, on the LikeCoin chain. ISCN provides a few simple but sufficient fields for record versioning:
- iscn:Record: a unique ISCN ID for each piece of content
- iscn:RecordVersion: the version number of the ISCN record, managed by the chain
- iscn:ContentFingerprints: content fingerprints, which can be IPFS/IPNS hashes or any URL
- iscn:recordParentIPLD: a pointer to the previous version of the ISCN record, which lets users retrieve older versions of files
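To show how the parent pointer makes older versions reachable, here is a small in-memory sketch that mirrors the four fields above. It does not query the LikeCoin chain or use any ISCN library: the `IscnRecord` class, the `trace_versions` helper, the store keys such as `ipld-v1`, the record ID `iscn-example`, and the `ipfs://` prefix on the fingerprints are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class IscnRecord:
    record_id: str                      # models iscn:Record
    record_version: int                 # models iscn:RecordVersion
    content_fingerprints: List[str]     # models iscn:ContentFingerprints
    record_parent_ipld: Optional[str]   # models iscn:recordParentIPLD (None for the first version)


def trace_versions(store: Dict[str, IscnRecord], head_key: str) -> List[IscnRecord]:
    """Follow parent pointers from the given record back to the first version."""
    history = []
    key: Optional[str] = head_key
    while key is not None:
        record = store[key]
        history.append(record)
        key = record.record_parent_ipld
    return history


# Hypothetical store: key -> record, standing in for records kept on chain.
store = {
    "ipld-v1": IscnRecord(
        "iscn-example", 1,
        ["ipfs://QmQPzjhk9Eqy7vJWhaheyg4NiBxAz51WXUTsx92STBTs1j"], None),
    "ipld-v2": IscnRecord(
        "iscn-example", 2,
        ["ipfs://QmTQ6xEFKFSvRTxncf46CvYQf1rnRnfduKM7VepKyGo3LR"], "ipld-v1"),
}

versions = [r.record_version for r in trace_versions(store, "ipld-v2")]
print(versions)
```

Starting from the newest record, a reader can walk back through every earlier version and its fingerprints, which is the traceability that a single mutable pointer lacks.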
There are significant differences between storing versioning data in a legacy database system and on a blockchain. Data stored on the LikeCoin chain is public and immutable, which makes every change traceable. This is very important for content with public implications, such as government documents, press footage, and historical records.
Implementing version control on chain moves the incompatible systems built on different proprietary platforms one step down, to the protocol layer. ISCN serves as the tie between data silos.
As the governance of LikeCoin chain is based on liquid democracy among token holders, disputes can be resolved by democratic procedures.
The content versioning problem can be resolved by ISCN. In fact, ISCN is a general content metadata infrastructure and can work with distributed file systems other than IPFS, such as Arweave, or with any legacy centralized storage.