...
The idea of delta encoding to reduce communication or storage costs
is not new. For example, the MPEG-1 video compression ...
... links [17] uses a clever, highly specialized form of delta encoding.
In spite of this history, it appears to have taken several years
before anyone thought of applying delta encoding to HTTP, perhaps
because the development of HTTP caching has been somewhat haphazard.
The first published suggestion for delta encoding appears to have
been by Williams et al. in a paper about HTTP cache ...
... The WebExpress project [15] appears to be the first published
description of an implementation of delta encoding for HTTP (which
they call "differencing"). WebExpress is aimed specifically at
...
... validated its own cache entry with the origin
server. The use of optimistic deltas, unlike delta encoding,
actually increases the number of bytes sent over the network, in an
...
... of the full contents of HTTP messages, to quantify the potential
benefits of delta-encoded responses. They showed that delta encoding
can provide remarkable improvements in response-size and response-
delay for an important subset of HTTP ...
... 13]. One aspect of the DRP proposal
is the use of "differential downloading," which is essentially a form
of delta encoding. The original DRP proposal uses a different
approach than is described here, but a forthcoming revision of DRP
will conform to the proposal in this document.
...
... 28] describe the "rsync" algorithm, which
accomplishes something similar to delta encoding. In rsync, the
client breaks a cache ...
...
Nothing in this specification specifically precludes the use of
a delta encoding for the body of a PUT request. However, no
mechanism currently exists for the client to discover if the
...
... HTTP response encoded as the difference between two instances.
More formally, delta encodings are members of a potentially larger
class of transformations on instances, leading to this new term:
...
... response
message. For example, a range selection or a delta
encoding. Instance manipulations are end-to-end, and
often involve the use of a cache ...
...
Content-coding According to the specification, "Content coding
values indicate an encoding transformation that has
been or can be applied to an entity. Content codings
...
...
Transfer-coding According to the specification, "Transfer coding
values are used to indicate an encoding
transformation that has been, can be, or may need to
be applied to an entity ...
... A client signals its willingness to receive a content-coding by
sending an "Accept-Encoding" header, listing the set of content-
codings that it understands. It may optionally include information
...
... about which content-codings it prefers. If a server uses any non-
identity content-coding(s), it includes a "Content-Encoding" header
field in the response, listing these content-codings in their order
of application.
...
... 9] did not include an analogous mechanism for negotiating
the use of transfer-codings, although it does include an analogous
"Transfer-Encoding" header for marking the response. A new "TE"
...
... header (short
for "Accept-Instance-Manipulation", which is far too long to spell
out), analogous to the "Accept-Encoding" header. Similarly, a server
lists the set of instance-manipulations it has applied using an "IM ...
...
One must understand the relationship between these transformations in
order to see how delta encoding applies to HTTP responses.
...
... identity content-
coding to the instance, or one might have been inherent in its
generation. This also results in a Content-Encoding header.
...
... header, and the appropriate
range(s) from the (possibly encoded) body. Delta encodings are
instance-manipulations, and are computed at this stage.
...
... the-fly compression could be done in this step. If so, a
Transfer-Encoding header is added to the message.
...
... of an entity, not of an instance or of a variant. For example, if
the message is a delta encoding, Content-Length gives the length
of the delta encoding, not the length of the current instance.
Diagrammatically, the sequence is:
...
... instance
| apply instance manipulation, if any
v (delta encoding, range selection, etc.)
entity ...
...
If both Ranges and delta encodings are forms of instance
manipulation, which should be applied first? This depends on how the
Range ...
... In the first use of Range, it would have to be applied after any
delta encoding, since the intended use is to recover an intact copy
of the delta-encoded instance. In the second use of Range, it would
have to be applied before any delta encoding, because otherwise the
offsets specified in the Range request would be meaningless (the
client generally cannot know how a server's delta encoding maps
instance byte offsets to entity byte offsets).
...
... its ordering with respect to other instance-manipulations) whether
range selection is done before or after delta encoding.
We also need a mechanism for the server to indicate in which order
...
... IM" header (see section 10.5.2), where we
follow the same practice used for the "Content-Encoding" header: the
"IM ...
... client to control whether compression is done
before or after delta encoding, since some simple differencing
algorithms (such as the UNIX ...
...
In this section, we explain the concepts behind delta encoding. This
is not meant as a formal specification of the proposed extensions;
see section 10 for that.
...
... 3. A way to indicate that the client is able to apply one or more
specific forms of delta encoding.
4. A way to mark a response as being delta-encoded in a particular
...
... to apply deltas (aside from the trivial 0% and 100% deltas), can be
accomplished by transmitting a list of acceptable delta-encoding
formats in a request-header field; specifically, the "A-IM ...
... two formats, "diffe" (i.e., output from the UNIX "diff -e"
command), and "vcdiff". (Encoding algorithms and formats, such
as "vcdiff", are described in section 6.)
...
... based on the mandatory ordering constraint specified in section
10.5.3, if both delta encoding and compression are applied,
then this "A-IM ...
...
However, if the server does want to compute a delta, and the set of
encodings it supports has more than one encoding in common with the
set offered by the client, which encoding should it use? This is
mostly at the option of the server, although the client can express
...
... 10] describes qvalues in more
detail. (Clients may prefer one delta encoding format over another
that generates a smaller encoding, if the decoding costs for the
first format are lower and the client is resource-constrained.)
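As a rough illustration of how a server might read the preferences that a client expresses through qvalues, the following Python sketch parses an A-IM field value into (token, qvalue) pairs. The function names `parse_a_im` and `preferred_order` and the sample field value are assumptions made for this illustration, and the parsing is deliberately simplified; it is not the formal syntax.

```python
def parse_a_im(value):
    """Parse an A-IM field value into a list of (token, qvalue) pairs.

    Simplified sketch: assumes comma-separated tokens, each optionally
    followed by ';q=<number>'. A missing qvalue is treated as 1.0.
    """
    prefs = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0
        for param in params.split(";"):
            key, _, val = param.strip().partition("=")
            if key.strip().lower() == "q" and val.strip():
                q = float(val)
        prefs.append((name.strip().lower(), q))
    return prefs


def preferred_order(value):
    """Return the acceptable tokens, most preferred first (q=0 is excluded)."""
    prefs = [(name, q) for name, q in parse_a_im(value) if q > 0]
    # sorted() is stable, so tokens with equal qvalues keep their header order.
    return [name for name, _ in sorted(prefs, key=lambda p: -p[1])]


# Illustrative field value (an assumption, not quoted from the specification).
print(preferred_order("vcdiff, gdiff;q=0.3"))  # prints ['vcdiff', 'gdiff']
```

A server could intersect this ordered list with the set of formats it supports; the final choice remains mostly at the option of the server.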
...
... CPU cycles are plentiful and network bandwidth is scarce,
the server might compute each of the possible encodings and then send
the smallest result. Or the server might use heuristics to choose an
encoding format, based on things such as the content-type of the
resource, the current size of the resource, and the expected amount
...
...
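The "compute each of the possible encodings and send the smallest" strategy can be sketched as follows. This is only an illustration under stated assumptions: `smallest_encoding`, `toy_suffix_delta` and `toy_dict_delta` are hypothetical names, and the two toy encoders stand in for real formats such as vcdiff or diffe, which are not implemented here.

```python
import zlib


def toy_suffix_delta(base, current):
    """Hypothetical delta: if current extends base, send only the new bytes."""
    if current.startswith(base):
        return current[len(base):]
    return current  # no saving possible with this toy format


def toy_dict_delta(base, current):
    """Hypothetical delta: deflate current using base as a preset dictionary."""
    comp = zlib.compressobj(zdict=base)
    return comp.compress(current) + comp.flush()


def smallest_encoding(base, current, encoders, acceptable):
    """Try every encoder the client accepts and keep the smallest output.

    Returns (name, body), or None when no delta is smaller than the full
    instance, in which case the server would send a regular response.
    """
    candidates = []
    for name in acceptable:
        encode = encoders.get(name)
        if encode is not None:
            candidates.append((name, encode(base, current)))
    if not candidates:
        return None
    name, body = min(candidates, key=lambda c: len(c[1]))
    if len(body) >= len(current):
        return None
    return name, body


encoders = {"toy-suffix": toy_suffix_delta, "toy-dict": toy_dict_delta}
base = b"hello world " * 50
print(smallest_encoding(base, base + b"!", encoders, ["toy-suffix", "toy-dict"]))
```

Returning None when nothing beats the full instance reflects the point that a sender can avoid a delta encoding whenever it would increase the response size.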
A response using delta encoding must be identified as such. This is
done using the "IM" response-header ...
... many bytes to the response headers, and so would reduce the
effectiveness of delta encoding. It is also not entirely clear that
this approach suppresses all caching by all HTTP/1.0 proxies ...
... We were reluctant to define an additional status code as part of
the support for delta encoding. However, we see no other
efficient way to remain compatible with the deployed base of
HTTP/1.0 ...
... IM" response-header field, indicating which
delta encoding is used in this response.
3. Its message-body ...
...
3. Its message-body is a delta encoding of the current instance,
rather than a full copy of the instance.
...
...
and the server either responds with a 304 (Not Modified) response, or
with the appropriate delta encoding.
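The server's choice between these outcomes can be sketched as a small decision function. The function name, its parameters, and the `base_available` flag (whether the server still holds the instance named by the client's entity tag) are assumptions made for this illustration; the status codes 304, 226 and 200 are the ones discussed in this document.

```python
def choose_response(if_none_match, current_etag, base_available,
                    acceptable_ims, supported_ims):
    """Return (status_code, instance_manipulation or None).

    Simplified sketch: compares a single entity tag and picks the first
    delta format, in the client's listed order, that the server supports.
    """
    if if_none_match == current_etag:
        return (304, None)          # client's copy is still current
    if base_available:
        for im in acceptable_ims:
            if im in supported_ims:
                return (226, im)    # delta against the client's base instance
    return (200, None)              # fall back to the full instance


print(choose_response('"A"', '"B"', True, ["vcdiff", "diffe"], {"diffe"}))
```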
Here are a few more examples, to clarify how the client request ...
...
then the meaning is the same as in the example above, except that
after the delta encoding (and compression, if any) is computed, the
server then returns only the first 100 bytes of the output of the
delta encoding. (If it is shorter than 100 bytes, the entire delta
encoding is returned.) Because the "range" token appears last in the
...
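The difference between selecting a range after delta encoding and selecting it before can be sketched with a toy delta function. Everything named here (`toy_delta`, `byte_range`, `range_after_delta`, `range_before_delta`) is hypothetical; `toy_delta` is a stand-in for a real format such as vcdiff and encodes only the length of the common prefix followed by the remaining bytes.

```python
def toy_delta(base, current):
    """Hypothetical delta: 4-byte common-prefix length, then the rest."""
    n = 0
    limit = min(len(base), len(current))
    while n < limit and base[n] == current[n]:
        n += 1
    return n.to_bytes(4, "big") + current[n:]


def byte_range(data, first, last=None):
    """Inclusive byte range; open-ended when last is None."""
    return data[first:] if last is None else data[first:last + 1]


def range_after_delta(base, current, first, last=None):
    """Range applied last: offsets refer to the output of the delta encoding."""
    return byte_range(toy_delta(base, current), first, last)


def range_before_delta(base, current, first, last=None):
    """Range applied first: offsets refer to the instances themselves."""
    return toy_delta(byte_range(base, first, last),
                     byte_range(current, first, last))


base = b"a" * 1000
current = b"a" * 950 + b"b" * 50
# The whole delta is shorter than 100 bytes, so all of it is returned.
print(len(range_after_delta(base, current, 0, 99)))
```

In the second function the range is selected from both instances before the delta between the two ranges is computed, so the offsets keep their meaning as instance byte offsets.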
...
The interaction between the If-Range mechanism and delta encoding is
somewhat complex. (If-Range means, informally, "if the entity ...
... The client wants to retrieve the remaining 100 bytes of the delta
encoding that was being sent in the interrupted response. It
therefore should send:
...
... case, is that the header field is ignored (by a
server that does not understand delta encoding).
Therefore, this is equivalent to the client's
...
... message-body
containing bytes 900 to 999 of the result of the
vcdiff encoding, with an "IM:vcdiff,range" response
...
... latter two cases (Tcur = "B" or Tcur = "C") and would not have been
able to apply the Range selection to the result of delta encoding.
On the other hand, suppose that the client ...
... server should send 304 (Not Modified), and if Tcur = "C", then the
server should send the entire new instance, either as a 200 response
or as a delta encoding against instance "A".
However, if Tcur = "B", in this case the server should first select
...
... the specified range (bytes 900 through the end) from both instances
"A" and "B", then compute the delta encoding between these ranges
(using vcdiff), and then transmit the result using a 226 (IM ...
... Encoding algorithms and formats ...
...
A number of delta encoding algorithms and formats have been described
in the literature:
...
... diff -e The UNIX "diff" program is ubiquitously available,
and is relatively fast for both encoding and decoding
(decoding is actually done using the "ed" program).
However, the size of the resulting deltas is
...
... deflate" [7, 6]) yields a more compact encoding, but
the costs of encoding and decoding are much higher
than for "diff" by itself. This algorithm can only
...
... algorithm named "vdelta." (Note that the "vcdiff"
format can be used either for delta encoding or as a
compressed format, so two different instance-
manipulation values would have to be registered in
...
... algorithm-independent format for expressing deltas.
Because it is more generic it is easy to implement,
but it may not be the most compact encoding format.
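To make the idea of an algorithm-independent delta concrete, the sketch below expresses a delta as a list of "copy from base" and "insert literal data" instructions. It is not an implementation of any of the formats named above: the instruction layout and the function names `make_delta` and `apply_delta` are assumptions for illustration, and Python's standard `difflib.SequenceMatcher` stands in for a real differencing algorithm.

```python
from difflib import SequenceMatcher


def make_delta(base, current):
    """Describe current as copies from base plus literal data."""
    ops = []
    matcher = SequenceMatcher(None, base, current, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))      # offset and length in base
        elif tag in ("replace", "insert"):
            ops.append(("data", current[j1:j2]))   # bytes absent from base
        # "delete": bytes of base that are simply not copied
    return ops


def apply_delta(base, ops):
    """Rebuild the current instance from the base instance and a delta."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


base = b"The quick brown fox"
current = b"The quick red fox jumps"
assert apply_delta(base, make_delta(base, current)) == current
```

Any differencing algorithm that emits such instructions can be decoded by the same `apply_delta`, which is the sense in which the representation is independent of the algorithm.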
Our proposal does not recommend any specific algorithm ...
... HTTP implementations.
We suspect that it should be possible to devise a delta encoding
algorithm appropriate for use on typical image ...
... potential [23], this may simply be because these experiments used
vdelta directly on the already-compressed forms of these encodings.
However, it might be necessary to devise a delta encoding algorithm
that is aware of the two-dimensional nature of images ...
... IM" response-header field (one that includes the
actual delta-encoding format used in the response.) Of course,
such uses are subject to all of the other HTTP ...
... client's cache might be
corrupt, or the implementation of delta encoding (either at client or
server) might have a bug.
...
... manipulations, if any, that have been applied to the instance
represented by the response. Typical instance manipulations include
delta encoding and compression.
...
... header. (See section 10.10 for specific discussion of
combining delta encoding and multipart/byteranges.)
Responses that include an IM ...
...
This example indicates that the entity-body is a delta encoding of
the instance, using the vcdiff encoding.
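Because the IM field lists instance-manipulations in their order of application, a recipient would process them in the reverse order. The sketch below shows one way to read the field; `parse_im` and `decoding_order` are hypothetical names, and the parsing ignores everything except comma-separated tokens.

```python
def parse_im(value):
    """Split an IM field value into tokens, in order of application."""
    return [tok.strip().lower() for tok in value.split(",") if tok.strip()]


def decoding_order(value):
    """Order in which a recipient would handle the manipulations.

    Note: a range cannot be "undone"; the recipient handles it first by
    combining the received bytes with those it already holds.
    """
    return list(reversed(parse_im(value)))


print(decoding_order("vcdiff,range"))  # prints ['range', 'vcdiff']
```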
...
...
This example indicates that the instance has first been delta-encoded
using the diffe encoding, then the result of that has been compressed
using deflate, and finally one or more ranges ...
... generating a delta-encoded response that the client can only
decode by first applying an instance-manipulation encoding to its
cached base instance. A server implementor might wish to consider
...
...
This example means that the client will accept a delta encoding in
either vcdiff or gdiff format.
...
...
This example means that the client will accept a delta encoding in
either vcdiff or gdiff format, but prefers the vcdiff format.
...
...
This example means that the client will accept a delta encoding in
either vcdiff or diffe format, and will accept the output of the
delta encoding compressed with gzip. It also means that the client
...
... gzip compression of the instance, without any delta
encoding, because A-IM provides no way to insist that gzip be used
...
...
The use of delta encoding with content-encoded instances adds some
slight complexity. When a client (perhaps a proxy ...
... When a server generates a delta-encoded response, the list of
content-codings the server uses (i.e., the value of the response's
Content-Encoding header field) SHOULD be a prefix of the list of
...
... prefix of the list of
content-codings the server would have used had it not generated a
delta encoding.
This requirement ...
... forwards this result from a cache entry, the forwarded response
MUST carry the same Content-Encoding header field as the new
(delta) response (and so it must be content-encoded as
...
... A-IM headers,
because if the server does not support delta encoding, the client
would at least like to achieve the benefits of compression ...
... Use of compression with delta encoding ...
... deflate coding provides a more compact result.) However, this is not
a requirement for the use of delta encoding, primarily because the
CPU-time costs associated with compression ...
...
A client that supports both delta encoding and compression as
instance-manipulations signals this by, for example
...
... uses both instance-manipulations in the response, that compression be
applied to the result of the delta encoding, rather than vice versa.
I.e., the response in this case would include
...
... compression, either as
a content-coding or as an instance-manipulation, before delta
encoding. Remember that the entity tag is assigned after content-
...
... coding but before instance-manipulation, so this choice does affect
the semantics of delta encoding.
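The ordering "delta first, then compression" and the matching decode order can be sketched as follows. The functions `toy_delta` and `toy_apply` are hypothetical stand-ins for a real delta format, and Python's `zlib.compress` (which produces the zlib format that HTTP calls "deflate") stands in for the compression step.

```python
import zlib


def toy_delta(base, current):
    """Hypothetical delta: 4-byte common-prefix length, then the rest."""
    n = 0
    limit = min(len(base), len(current))
    while n < limit and base[n] == current[n]:
        n += 1
    return n.to_bytes(4, "big") + current[n:]


def toy_apply(base, delta):
    """Inverse of toy_delta: reuse the shared prefix, append the new bytes."""
    n = int.from_bytes(delta[:4], "big")
    return base[:n] + delta[4:]


def encode(base, current):
    """Server side: compute the delta, then compress the delta."""
    return zlib.compress(toy_delta(base, current))


def decode(base, body):
    """Client side: reverse order, decompress first, then apply the delta."""
    return toy_apply(base, zlib.decompress(body))


base = b"x" * 500 + b"old"
current = b"x" * 500 + b"new tail"
assert decode(base, encode(base, current)) == current
```

Compressing before differencing would hand the delta step two byte streams that may share little even when the instances are similar, which is one reason the order matters.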
...
... Delta encoding and multipart/byteranges ...
... ranges.
If a server chooses to use a delta encoding for a
multipart/byteranges response, it MUST generate a response in
accordance with the following rules.
...
... for a delta-encoded response, it would never send "A-IM: vcdiff" (or
listing other delta encoding formats) for its unconditional requests.
The same study showed that at least 46% of the requests in lengthy
...
... GIF and JPEG). As noted in section 6,
we do not currently know of a delta-encoding format suitable for such
image types. Unless a client did support such a delta-encoding
format, it would presumably not ask for a delta when making a
conditional request for image ...
... also be included. However, none of these extra headers would be
included except in cases where a delta encoding is actually employed,
and the sender of the response can avoid sending a delta encoding if
this results in a net increase in response size. Thus, a delta-
encoded response should never be larger than a regular response for
...
... the same request.
Simulations suggest that, when delta encoding pays off at all, it
saves several thousand bytes [23]. Thus, adding a few dozen bytes to
...
... probably optimize future responses. Neither of these headers is
necessary for the simpler uses of delta encoding.
...
...
We are not aware of any aspects of the basic delta encoding mechanism
that affect the existing security considerations for the HTTP/1.1 ...
...
Phong Vo has provided a great deal of guidance in the choice of delta
encoding algorithms and formats. Issac Goldstand and Mike Dahlin
provided a number of useful comments on the specification. Dave
...
... Jeffrey C. Mogul, Fred Douglis, Anja Feldmann, and Balachander Krishnamurthy. Potential benefits of delta encoding and data compression for HTTP. Research Report 97/4, DECWRL, July, 1997. ...
