Trusted Server Caching and Validation #158

Open · jeffkaufman opened this issue Mar 24, 2021 · 6 comments
Labels: Non-breaking Feature Request (feature request for functionality unlikely to break backwards compatibility)
Comments

jeffkaufman (Contributor) commented Mar 24, 2021:

The FLEDGE trusted server call shouldn't use standard HTTP caching semantics because it is logically multiple calls bundled together for efficiency. For example, say the browser wants ["https://www.kv-server.example", "publisher.com", ["key1", "key2"]] and does:

```
GET https://www.kv-server.example/?hostname=publisher.com&keys=key1,key2
```

It might receive back something like:

```
Cache-Control: max-age=3600
...
{"key1": ..., "key2": ...}
```

If a minute later, within the max-age=3600, it wants ["https://www.kv-server.example", "publisher.com", ["key2"]], it should be able to satisfy that from cache, but current HTTP caching semantics mean it won't know to look at the cached response for the earlier request. Similarly, if it wanted ["https://www.kv-server.example", "publisher.com", ["key1", "key2", "key3"]], it should be able to send a request only for key3. One way to handle this would be a FLEDGE-specific cache for these key-value pairs.
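As an illustration of what such a FLEDGE-specific cache could look like, here is a minimal sketch, assuming a per-key in-memory store; the KeyValueCache name, its methods, and the tuple-shaped cache key are hypothetical, not part of any spec or implementation.

```python
import time

class KeyValueCache:
    """Hypothetical per-key cache for trusted server responses."""

    def __init__(self):
        # (kv_server, hostname, key) -> (value, expiry_time)
        self._entries = {}

    def store(self, kv_server, hostname, values, max_age):
        # Cache each key from a batched response individually, honoring max-age.
        expiry = time.time() + max_age
        for key, value in values.items():
            self._entries[(kv_server, hostname, key)] = (value, expiry)

    def lookup(self, kv_server, hostname, keys):
        # Split requested keys into cached values and keys that still need a fetch.
        cached, missing = {}, []
        for key in keys:
            entry = self._entries.get((kv_server, hostname, key))
            if entry is not None and entry[1] > time.time():
                cached[key] = entry[0]
            else:
                missing.append(key)
        return cached, missing
```

With something like this, the max-age=3600 response above would populate entries for key1 and key2; a later request for ["key2"] would be a pure cache hit, and ["key1", "key2", "key3"] would only need a network fetch for ...&keys=key3.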

Since responses may be large, it would be good to support revalidation requests, allowing the trusted server to save bytes on keys whose cached values are already up to date. Cache state here is intended to be tracked at the key level rather than the request level, so the browser can't use the request-level If-None-Match or If-Modified-Since headers.

Could the request consist of a list of key-validator pairs, allowing the server to omit any keys whose values haven't changed? For example:

```
[
  ["key1", ["If-None-Match", "<hash of previous value1>"]],
  ["key2", ["If-None-Match", "<hash of previous value2>"]],
  ["key3", []], // no previous value
  …
]
```
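Here is a rough sketch of the browser side under that proposal, assuming SHA-256 hashes of the cached values as the validators and a response that simply omits unchanged keys; the function names and the hashing choice are illustrative assumptions, not a proposed wire format or API.

```python
import hashlib
import json

def build_revalidation_request(keys, cached_values):
    # Pair each requested key with a validator derived from its cached value, if any.
    request = []
    for key in keys:
        if key in cached_values:
            digest = hashlib.sha256(
                json.dumps(cached_values[key], sort_keys=True).encode()).hexdigest()
            request.append([key, ["If-None-Match", digest]])
        else:
            request.append([key, []])  # no previous value
    return request

def merge_revalidated_response(keys, cached_values, response_values):
    # A key omitted from the response means "not modified": reuse the cached value.
    # A key present in the response carries a new value that replaces the cached one.
    return {key: response_values.get(key, cached_values.get(key)) for key in keys}
```

The idea is that "key absent from the response" stands in for a per-key 304, which is what lets the server save bytes on values that haven't changed.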

Alternatively, it's possible that with HTTP/2+ the cost of sending one request per key (https://www.kv-server.example/?hostname=publisher.com&key=key1) would be low enough that we wouldn't need batching at all. Our guess is that this is not the case, but it's an empirical question. Dropping batching would also make the browser implementation cleaner, since these calls would no longer require special cache treatment.

jeffkaufman (Contributor, Author) commented:
Now that Chrome is farther along in the implementation, I think this may be worth revisiting, especially the first half of the proposal around a FLEDGE-specific cache. In the current implementation, the trustedScoringSignals response is never cached while the trustedBiddingSignals response is only cached if the URL matches exactly (as you would expect from the standard cache semantics). An uncacheable or poorly cached round trip that blocks the auction is quite bad from a latency perspective, so it would be nice if we could do something better here.

JensenPaul added the Non-breaking Feature Request label on Jun 23, 2023
rdgordon-index (Contributor) commented:
> In the current implementation, the trustedScoringSignals response is never cached

@MattMenke2 -- can you amend #906 to cover this as well? There's mention of a global HTTP cache, but only in the context of bidder worklets, not seller worklets -- and I think it's crucial to clarify whether this is still a privacy concern with the current implementation.

MattMenke2 (Contributor) commented:
I don't think this is worth writing up at the moment; we just use standard HTTP caching semantics. I think we may use a transient network partition for seller signals, because we can't leak the URL to any network partition without essentially leaking cross-origin cookie-equivalents to whatever partition we use. We currently use the bidder's first-party partition for bidder signals, which is also leaky.

We'll need to move over to something completely different once we use an actual trusted server for these fetches. Since OHTTP doesn't work with HTTP caching semantics, we'll need our own cache at that point (probably short-lived and in-memory, but we'll see). We may wire up that cache in place of the current HTTP caching + network partitioning scheme for our current request format, and switch to a more privacy-preserving partitioning approach as well -- e.g., we could use the network partition that was used to join the interest group in the first place, though that would potentially mean more network connections/requests.
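To make the "short-lived and in-memory" idea a bit more concrete, here is a minimal sketch, assuming a TTL-bounded cache keyed by an explicit partition key (for example, the partition used to join the interest group) plus the signals key; the class, the partition-key shape, and the 60-second default TTL are all assumptions, not how Chrome works or will work.

```python
import time

class ShortLivedSignalsCache:
    """Hypothetical in-memory cache for trusted-signals values fetched over OHTTP."""

    def __init__(self, ttl_seconds=60):
        self._ttl = ttl_seconds
        # (partition_key, signals_key) -> (value, inserted_at)
        self._entries = {}

    def put(self, partition_key, signals_key, value):
        self._entries[(partition_key, signals_key)] = (value, time.time())

    def get(self, partition_key, signals_key):
        entry = self._entries.get((partition_key, signals_key))
        if entry is None:
            return None
        value, inserted_at = entry
        if time.time() - inserted_at > self._ttl:
            # Expire quickly so cached values don't outlive an auction by much.
            del self._entries[(partition_key, signals_key)]
            return None
        return value
```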

Anyhow, there's a lot to work out here. I don't think it's worth documenting how things currently work, as it's likely to change pretty drastically at some point in (hopefully) the fairly near future, though any rollout of new behavior will likely be slow so we can compare its performance with the current behavior.

rdgordon-index (Contributor) commented:
> we just use standard HTTP caching semantics

So, just to be explicit: is the following statement from earlier in the thread no longer true, re: trustedScoringSignals being "never cached"?

> In the current implementation, the trustedScoringSignals response is never cached while the trustedBiddingSignals response is only cached if the URL matches exactly

MattMenke2 (Contributor) commented:
With network partitioning, all network requests need to be associated with a network partition. With HTTP caching, any page that shares a network partition with trusted signals requests can probe the cache for responses, to try to see what bids were made on a page. This is a pretty serious violation of both FLEDGE's user-tracking model and, more fundamentally, the cross-origin attack model of the web, since it potentially exposes cookie-equivalents to third parties. If we use the publisher's network partition, we expose ad URLs for bids to the publisher page and everything in it. If we use the seller's partition, we expose to the seller (and to any 3P scripts it runs, if the user navigates to the seller's origin) which ads bidders wanted to show to the user, which is also not great.

So I think we currently act as if seller requests came from an opaque origin. I'm not a cache expert, but I think our cache may not cache anything for network partitions associated with opaque origins at the moment, so the "never cached" behavior may well actually be true. If we did cache in those cases, auctions run at the same time from the same page could actually share cached responses (all requests for the lifetime of an internal SellerWorklet object share a globally unique key for their opaque origin, until we tear down the seller worklet).

For bidder fetches, we do use the bidder's network partition, which has the same 3P-script leak as the seller's origin, so we probably want to improve things there too, but caching is also likely more useful there, hence accepting the temporary leak until we have a better caching strategy.
