Registration information

   Tag             20853 (quoted)
   Data Item       any
   Semantics       description of the value instead of the value itself
   Reference       https://cbor.is4.site/quoted
   Contact         IS4 <is4.site@gmail.com>

Summary

This tag is intended to be used by CBOR-based data formats to indicate a quoted value, i.e. a description or construction of a value instead of just the raw value itself.

Rationale

Some programming languages or environments can treat their syntactic elements as first-class objects that can be described or analyzed. For example, languages that support reflection may represent types or functions as objects that store their identity and allow introspection, and languages that expose Abstract Syntax Trees allow arbitrary objects to be encoded as inspectable code that performs their construction.

This kind of semantic shift, from language to metalanguage, can be represented by a tag. Such objects can then be encoded using the existing CBOR mechanisms for representing the elements they construct or refer to, which allows abstractions over data representations. By using this tag, implementations can decode not just the data, but also how the data is meant to be reconstructed or understood.

Semantics

The interpretational value of this tag is distinct from that of the tag content, and represents the particular description or method of construction of the enclosed value, as indicated by its encoding. That is, it preserves the structure or description of the tagged value in a way that supports its reconstruction as well as subsequent inspection of the process of its construction.

Since the tag is typologically distinct from its content (unless it encloses another quoted value), it does not stand for its content (unlike, for example, tag 28 (shareable)) and therefore SHOULD NOT be used in places where only non-quoted values are expected. For example, a quoted integer cannot be used instead of a normal integer without a change in meaning, as it identifies the particular method used to construct the integer instead of just its value.

The interpretation is intended to provide information about the value it encloses in a way that retains certain characteristics of its encoding that would otherwise be lost to the interpretation of the value alone, such as the lengths of integral values, their original tags, or the order of map keys. Additionally, if value or string sharing is used, the quoted value MAY be decoded in a way that preserves the identity of shared strings or values in places where only their values would normally be retained.

The precise way a decoder exposes a quoted value depends on the capabilities of the target language (examples below). A decoder MAY retain only some characteristics of the encoding, to make the result natural or convenient to use in the specific programming environment. For example, if the target language supports different fixed-width integer types (such as int32_t or uint32_t in C), they MAY be used to indicate the original length and sign of the encoded CBOR integer, even though some encodings coalesce to the same type. The value -129, for instance, can be encoded as either 38 80 or 39 0080, and both encodings map to int16_t in C, the smallest integer type capable of holding the value.
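The following Python sketch illustrates one way a decoder could retain the encoded length of an integer. It is not part of the registration: the QuotedInt type and the quote_int function are hypothetical names introduced here, and the sketch assumes the input starts with a single well-formed CBOR integer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotedInt:
    """Hypothetical description of an encoded CBOR integer."""
    value: int           # the plain integer value
    argument_bytes: int  # bytes following the initial byte: 0, 1, 2, 4 or 8
    negative: bool       # True if major type 1 was used


def quote_int(data: bytes) -> QuotedInt:
    """Describe the CBOR integer at the start of data (assumed well-formed)."""
    initial = data[0]
    major, info = initial >> 5, initial & 0x1F
    if major not in (0, 1):
        raise ValueError("not a CBOR integer")
    if info < 24:
        argument, width = info, 0          # value stored in the initial byte
    elif info <= 27:
        width = 1 << (info - 24)           # 1, 2, 4 or 8 bytes
        if len(data) < 1 + width:
            raise ValueError("truncated integer")
        argument = int.from_bytes(data[1:1 + width], "big")
    else:
        raise ValueError("invalid additional information for an integer")
    value = argument if major == 0 else -1 - argument
    return QuotedInt(value, width, major == 1)


narrow = quote_int(bytes.fromhex("3880"))    # -129 with a 1-byte argument
wide = quote_int(bytes.fromhex("390080"))    # -129 with a 2-byte argument
assert narrow.value == wide.value == -129
assert narrow != wide                        # the descriptions differ
```

Here both descriptions unquote to the same value, while the argument_bytes field keeps the distinction that a C decoder mapping both to int16_t would discard.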

As a consequence, independent implementations supporting this tag do not need to agree on precisely how much information is preserved in quoted values, the only requirement being that a quoted value can be distinguished from a plain (unquoted) one, and that it supports retrieval of the original value in some equivalent form. If this tag is used in a protocol, implementers need to ensure that the sender and receiver retain all information relevant to the protocol.

This tag is similar to tag 24 (encoded CBOR data item) in that it refers to a particular encoding or representation of a value rather than the value itself, but it is not limited to exposing the value as a CBOR stream – the construction of the value can use any operations supported by the program, as long as the correct value can be obtained.
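The difference from tag 24 is also visible on the wire: the content of tag 24 is a byte string that holds an embedded encoding, whereas the content of tag 20853 is the data item itself. The Python sketch below builds both forms for the array [1, 2]; the helper names tag_head and byte_string are hypothetical and only cover the short cases needed here.

```python
def tag_head(number: int) -> bytes:
    """Encode the head of a CBOR tag (major type 6), shortest form."""
    if number < 24:
        return bytes([0xC0 | number])
    if number < 0x100:
        return bytes([0xD8, number])
    if number < 0x10000:
        return bytes([0xD9]) + number.to_bytes(2, "big")
    raise ValueError("larger tag numbers are omitted from this sketch")


def byte_string(payload: bytes) -> bytes:
    """Encode a short definite-length byte string (major type 2)."""
    if len(payload) >= 24:
        raise ValueError("longer byte strings are omitted from this sketch")
    return bytes([0x40 | len(payload)]) + payload


array_1_2 = bytes.fromhex("820102")                     # [1, 2]

as_tag_24 = tag_head(24) + byte_string(array_1_2)       # 24(h'820102')
as_quoted = tag_head(20853) + array_1_2                 # 20853([1, 2])

assert as_tag_24.hex() == "d81843820102"
assert as_quoted.hex() == "d95175820102"
```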

This tag is also similar to tag 22098 (indirection) in that it indicates a lifting operation whose result could eventually be lowered to the original value, but it operates in terms of descriptions (on a meta-level) rather than references.

Examples

The following examples illustrate the intended interpretation of this tag, but do not impose any conformance requirements on encoders or decoders that support it in these contexts. Note that if the quoted tag is used in a context where the processor needs to distinguish and preserve the particular encoding of individually decoded CBOR values, as in the following examples, the CBOR decoder itself has to take part in processing the tag.

[
  20853([1, 2]),
  20853([_ 1, 2])
]

82          # array(2)
   D9 5175  # tag(20853)
      82    # array(2)
         01 # unsigned(1)
         02 # unsigned(2)
   D9 5175  # tag(20853)
      9F    # array(*)
         01 # unsigned(1)
         02 # unsigned(2)
         FF # primitive(*)

This example encodes an array containing two descriptions of equivalent arrays, the first encoded with a definite length and the second using the indefinite-length encoding. Instead of storing plain arrays, this stores descriptions of such arrays, which are exposed using a distinct type that MAY retain whether each array was encoded using a definite or an indefinite length. For example, in languages such as C# or Java, one MAY use the Array type for the definite-length case (indicating fixed size), while using the List/ArrayList type for the indefinite-length array (indicating flexible size).
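As a rough illustration, the Python sketch below decodes the bytes of this example with a deliberately tiny decoder that only understands unsigned integers, arrays, and tag 20853. The QuotedArray type and the read_head and decode functions are hypothetical names; the choice to apply quoting to nested arrays as well is an assumption of this sketch, not something the tag requires, and truncated input is not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuotedArray:
    """Hypothetical description of an encoded CBOR array."""
    items: tuple
    indefinite: bool   # True if the indefinite-length encoding was used


def read_head(data, pos):
    """Return (major, info, argument, next_pos) for the item at pos."""
    initial = data[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:
        return major, info, info, pos
    if info <= 27:
        width = 1 << (info - 24)
        return major, info, int.from_bytes(data[pos:pos + width], "big"), pos + width
    if info == 31:
        return major, info, None, pos      # indefinite length, no argument
    raise ValueError("reserved additional information")


def decode(data, pos=0, quoted=False):
    """Decode one item; returns (value, next_pos)."""
    major, info, arg, pos = read_head(data, pos)
    if major == 0 and arg is not None:
        return arg, pos
    if major == 4:
        items = []
        if info == 31:
            while data[pos] != 0xFF:       # read items until the break byte
                item, pos = decode(data, pos, quoted)
                items.append(item)
            pos += 1                       # skip the break byte
        else:
            for _ in range(arg):
                item, pos = decode(data, pos, quoted)
                items.append(item)
        if quoted:
            return QuotedArray(tuple(items), info == 31), pos
        return items, pos
    if major == 6 and arg == 20853:
        return decode(data, pos, quoted=True)
    raise ValueError("item not supported by this sketch")


encoded = bytes.fromhex("82d95175820102d951759f0102ff")
(first, second), end = decode(encoded)
assert end == len(encoded)
assert first.items == second.items == (1, 2)
assert (first.indefinite, second.indefinite) == (False, True)
```

The outer array is returned as a plain list because it is not quoted, while the two inner arrays are exposed as descriptions that compare unequal even though their items are the same.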

[
  20853(4([0, 10])),
  20853(4([1, 1]))
]

82             # array(2)
   D9 5175     # tag(20853)
      C4       # tag(4)
         82    # array(2)
            00 # unsigned(0)
            0A # unsigned(10)
   D9 5175     # tag(20853)
      C4       # tag(4)
         82    # array(2)
            01 # unsigned(1)
            01 # unsigned(1)

This example stores two descriptions of equivalent decimal fractions with the value 10, first encoded as 10×10⁰, and then as 1×10¹. While these two numbers may normally be indistinguishable to an application decoding them, quoting their construction MAY result in an object that exposes the exponent and mantissa individually, allowing such values to be differentiated. Such an object might be represented as an Abstract Syntax Tree in languages that support them, referring individually to the multiplication and exponentiation operations used to construct the number.
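One possible shape for such an object is sketched below in Python. The QuotedDecimalFraction type and its unquote method are hypothetical names chosen for this illustration; the sketch assumes the tag 4 content has already been decoded into its [exponent, mantissa] pair and uses the standard decimal module to recover the plain value.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class QuotedDecimalFraction:
    """Hypothetical description of a tag 4 value: mantissa × 10^exponent."""
    exponent: int
    mantissa: int

    def unquote(self) -> Decimal:
        """Return the plain value the description stands for."""
        return Decimal(self.mantissa).scaleb(self.exponent)


ten_a = QuotedDecimalFraction(exponent=0, mantissa=10)   # 20853(4([0, 10]))
ten_b = QuotedDecimalFraction(exponent=1, mantissa=1)    # 20853(4([1, 1]))

assert ten_a.unquote() == ten_b.unquote() == 10   # same value
assert ten_a != ten_b                             # different constructions
```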

Other concrete examples of where quoting could apply:

Security considerations

Applications using this tag to represent expressions or other pieces of executable or evaluable content MUST ensure that arbitrary code coming from untrusted sources is never put in a location where it could be executed as a part of evaluating the expression.