Content Identifiers (CIDs)
A content identifier, or CID, is a label used to point to material in IPFS. It doesn’t indicate where the content is stored, but it forms a kind of address based on the content itself. CIDs are short, regardless of the size of their underlying content.
CIDs are based on the content’s cryptographic hash. That means:
- Any difference in content will produce a different CID, and
- The same piece of content added to two different IPFS nodes using the same settings will produce exactly the same CID.
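Both properties follow from how cryptographic hashes behave. The sketch below is only an illustration, assuming SHA-256 over the raw bytes. IPFS hashes the encoded blocks it builds from the content, so these digests are not real CIDs, and the `digest` helper is a hypothetical name introduced for this example.

```python
import hashlib


def digest(content: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes (illustrative helper)."""
    return hashlib.sha256(content).hexdigest()


a = digest(b"hello ipfs")
b = digest(b"hello ipfs")        # identical content
c = digest(b"hello ipfs!")       # one extra byte
big = digest(b"x" * 1_000_000)   # much larger input

print(a == b)            # True: same content, same digest
print(a == c)            # False: any difference changes the digest
print(len(a), len(big))  # 64 64: digest length does not depend on input size
```

The last line also shows why CIDs stay short regardless of the size of the underlying content: the hash output has a fixed length.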
CIDs can take a few different forms with different encoding bases or CID versions. Many of the existing IPFS tools still generate v0 CIDs, although the files (MFS) and object operations now use CIDv1 by default.
When IPFS was first designed, we used base58-encoded multihashes as the content identifiers. This is simpler, but much less flexible, than newer CIDs. CIDv0 is still used by default for many IPFS operations, so you should generally try to support v0.
If a CID is 46 characters starting with “Qm”, it’s a CIDv0 (for more details, check the decoding algorithm in the CID specification).
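The check described above can be sketched in Python using only the standard library. The function names (`base58_encode`, `base58_decode`, `looks_like_cidv0`) are hypothetical helpers written for this example, not part of any IPFS library. Beyond the length and prefix test, the sketch also decodes the string as base58btc and confirms that the result is a 34-byte SHA-256 multihash (leading bytes `0x12 0x20`), which is the binary form the CID specification gives for CIDv0.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes as base58btc (illustrative, not optimized)."""
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = BASE58_ALPHABET[rem] + out
    # Each leading zero byte is represented by a leading '1'.
    pad = len(data) - len(data.lstrip(b"\x00"))
    return "1" * pad + out


def base58_decode(text: str) -> bytes:
    """Decode base58btc text; raises ValueError on characters outside the alphabet."""
    n = 0
    for ch in text:
        n = n * 58 + BASE58_ALPHABET.index(ch)
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    pad = len(text) - len(text.lstrip("1"))
    return b"\x00" * pad + body


def looks_like_cidv0(cid: str) -> bool:
    """Return True if the string has the shape of a CIDv0."""
    if len(cid) != 46 or not cid.startswith("Qm"):
        return False
    try:
        raw = base58_decode(cid)
    except ValueError:
        return False
    # 0x12 = sha2-256 multihash code, 0x20 = 32-byte digest length.
    return len(raw) == 34 and raw[0] == 0x12 and raw[1] == 0x20


# Build a CIDv0-shaped string from an arbitrary digest, for demonstration only.
example = base58_encode(b"\x12\x20" + hashlib.sha256(b"hello").digest())
print(looks_like_cidv0(example))
print(looks_like_cidv0("not-a-cid"))
```

The example string is built directly from a digest of arbitrary bytes, so it has the right shape but does not necessarily identify any content stored in IPFS.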
CIDv1 contains some leading identifiers that clarify exactly which representation is used, along with the content hash itself. These include:
- A multibase prefix, specifying the encoding used for the remainder of the CID
- A CID version identifier, which indicates which version of CID this is
- A multicodec identifier, indicating the format of the target content — it helps people and software to know how to interpret that content after the content is fetched
These leading identifiers also provide forward compatibility, allowing future versions of CID to support different formats.
You can use the first few bytes of the CID to interpret the remainder of the content address and know how to decode the content after it’s fetched from IPFS. For more details, check out the CID specification. It includes a decoding algorithm and links to existing software implementations for decoding CIDs.
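As a rough illustration of reading those leading identifiers, the sketch below parses a CIDv1 string in Python. It makes several assumptions that should be stated up front: it only handles the `b` multibase prefix (lowercase base32 without padding), the helper names (`read_varint`, `parse_cidv1_base32`, `make_cidv1_base32`) are hypothetical, and the example CID is built locally from a SHA-256 digest of arbitrary bytes rather than fetched from IPFS. For real use, prefer one of the implementations linked from the CID specification.

```python
import base64
import hashlib


def read_varint(data: bytes, pos: int) -> tuple:
    """Decode an unsigned varint starting at pos; return (value, next position)."""
    value, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return value, pos
        shift += 7


def parse_cidv1_base32(cid: str) -> dict:
    """Split a base32 CIDv1 string into its leading identifiers and digest."""
    if not cid.startswith("b"):
        raise ValueError("this sketch only handles the 'b' (base32) multibase prefix")
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)  # restore the padding that multibase omits
    raw = base64.b32decode(body)

    version, pos = read_varint(raw, 0)
    codec, pos = read_varint(raw, pos)
    hash_code, pos = read_varint(raw, pos)
    hash_len, pos = read_varint(raw, pos)
    return {
        "multibase": cid[0],
        "version": version,
        "codec": codec,          # e.g. 0x70 = dag-pb, 0x55 = raw
        "hash_code": hash_code,  # e.g. 0x12 = sha2-256
        "digest": raw[pos:pos + hash_len],
    }


def make_cidv1_base32(codec: int, content: bytes) -> str:
    """Build a CIDv1 string for demonstration (codec must fit in one varint byte)."""
    digest = hashlib.sha256(content).digest()
    raw = bytes([0x01, codec, 0x12, len(digest)]) + digest
    return "b" + base64.b32encode(raw).decode("ascii").lower().rstrip("=")


cid = make_cidv1_base32(0x55, b"hello")
info = parse_cidv1_base32(cid)
print(cid[:7])
print(info["version"], hex(info["codec"]), hex(info["hash_code"]))
```

Because the version and codec are varints at the start of the decoded bytes, a reader can identify the CID version and content format before looking at the hash itself.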