Securely Propagating Auction Signals #119

Open
jeffkaufman opened this issue Feb 22, 2021 · 43 comments
Labels
FLEDGE, Non-breaking Feature Request

Comments

@jeffkaufman
Contributor

We're considering integrating FLEDGE with a flow that looks like:

  1. The ad tag on the client sends a traditional contextual ad request.

  2. It receives a contextual ad response, and also auction signals generated on the server, such as the estimated likelihood of the slot being viewable.

  3. The tag invokes the FLEDGE APIs to run an interest group auction, providing those signals.
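The three steps above can be sketched roughly as follows. This is a minimal illustration, not the actual ad tag: `runAdTag`, `fetchContextual`, and the `seller.example` URLs are hypothetical names, and both the network call and `runAdAuction` are injected so the sketch runs outside a browser.

```javascript
// Hypothetical ad-tag flow. fetchContextual and runAdAuction are injected
// stand-ins for the real network request and navigator.runAdAuction.
async function runAdTag({ fetchContextual, runAdAuction, slotId }) {
  // 1. Send a traditional contextual ad request.
  const response = await fetchContextual(slotId);

  // 2. The response carries the contextual ad plus server-generated signals.
  const { contextualAd, auctionSignals } = response;

  // 3. Hand the signals to the interest group auction. Note that they pass
  //    through page JavaScript here, which is where other scripts on the
  //    page could observe them.
  const auctionResult = await runAdAuction({
    seller: "https://seller.example", // placeholder origin
    decisionLogicUrl: "https://seller.example/decision-logic.js", // placeholder
    auctionSignals,
  });

  return { contextualAd, auctionResult };
}
```

The point of the sketch is step 3: anything the tag can read, any other script in the same JS context can read as well.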

We are concerned that other scripts on the page could extract auction signals from the contextual ad responses. These signals are the outputs of complex proprietary models, and access to high-quality bidding signals is one reason buyers choose a particular sell-side platform.

Could FLEDGE provide a path for the server to provide auction signals to the browser auction, without making them accessible to other scripts on the publisher page? (We understand that this would not protect signals from headless browsers or manual inspection, and we think these vectors should be followed up separately.)

One way this could work would be if the worklet processing the decision logic began executing before the auction. The worklet could send a request to the ad server to fetch additional signals to append to auction_signals, which the server would restrict to the worklet's origin with an ACAO response header. If other scripts on the page attempted to read the response, they would be blocked by the same-origin policy.

Unfortunately this requires an additional round trip to fetch the signals. We could avoid that by having the contextual ad response be a web bundle, where the signals are provided in an opaque-origin resource, with an ACAO header limiting response access to the seller's origin. The ad tag could pass the opaque resource's URL to the decision logic worklet through auction_signals, and the signals fetch could be fulfilled from the bundle.

@michaelkleber
Collaborator

Just to be sure I understand: your goal is for the seller to receive some opaque blob with the contextual response, which the seller's domain (e.g. its worklet) can open up and then pass along to the buyers in its auction? So you are trying to protect the contents of this blob from use outside the auction you're running, but are willing to allow free use within that auction?

This seems entirely reasonable as a goal. I'll need to work with some other Chrome engineers to figure out the technical details of how to support it, though.

@jeffkaufman
Contributor Author

Yes, that's right!

@jeffkaufman
Contributor Author

@michaelkleber This is still something we're interested in -- is this something you've been able to give more thought to?

I think this is probably something that could be built on top of Subresource Bundles.

@michaelkleber
Collaborator

Hi @jeffkaufman : We have not been pursuing this approach. I presume, from your bumping this issue, that the use cases you're considering cannot be addressed instead by use of the seller's Key-Value server?

@JensenPaul, in case you have thoughts on how to consider this for future planning.

@jeffkaufman
Contributor Author

Yes, we have signals that we don't want to expose to other scripts running in the publisher JS context, and these signals depend on the contextual ad request and so can't come from the K-V server.

@sbelov

sbelov commented Jan 19, 2022

Being able to privately propagate auction signals to the FLEDGE auction also seems relevant in the context of discussions in #59 and #202 on how multiple sellers might be supported: different sellers who work with a given publisher and may have code running on that publisher’s pages might wish to keep their own auction signals private and not readable by other sellers. While a seller could invoke runAdAuction within an iframe, isolating their signals from anyone outside the iframe, iframe-based isolation does not seem possible with some of the multi-seller support proposals.

@JensenPaul
Collaborator

When the contextual signals are returned (the ones that you’re asking to securely propagate), can they be exposed to JS momentarily, so they can be passed from, for example, the XHR result to a new API to convert them to an opaque blob?

Is the goal that one blob of signals is passed to all of a seller’s bidders?

@jeffkaufman
Contributor Author

@JensenPaul if the signals are exposed momentarily, then other JS running on the page can read them, so I don't think that works? Let me write something up describing a few ideas for how to implement this and get back to you?

> Is the goal that one blob of signals is passed to all of a seller’s bidders?

Some signals are for a seller's bidders (ex: "how likely is this slot to meet the ActiveView criteria") and those could go to all. Other signals are for the seller themself in scoring bids (ex: "how valuable is this slot on this particular page right now").

@jeffkaufman
Contributor Author

jeffkaufman commented Jan 25, 2022

@JensenPaul Ok, here are four potential approaches to protecting server-generated contextual signals from scripts running on the publisher page:

  1. Run the auction inside a cross-domain iframe. The seller can either request the signals from within the iframe or with <iframe src="https://signals-url">. Unfortunately, this only works for the single-seller case. If you have multiple sellers (component auctions; see #251, "Update explainer to include previously discussed multi-SSP mechanism") you have the problem that all of the signals, at some point, need to end up in the same JS context so they can be passed to runAdAuction.

  2. Use cryptography. The browser could provide a public key (per-site or per-pageview), and sellers could include that key on their requests for contextual signals. The key would be supplied in a header, to prevent an MITM attack where attacking JS substitutes a different public key. In their responses sellers could include signals encrypted with that key. Each seller would include these encrypted signals in their auction configuration, and the browser would decrypt them before making them available to the worklets. On the other hand, cryptography should not be necessary for this use case, since sellers just need some way to provide signals to the browser with a request that they only be exposed to their Turtledove worklets.

  3. Use additional network requests. The API to initiate an auction could be extended to allow each seller to provide some signals by URL. The browser would fetch these URLs and make the results available to the seller's worklets, with a request header like Sec-Fetch-Dest: turtledove so the seller would know that their responses would not be accessible to non-Turtledove readers. This does add latency, however, with the signals fetch requiring a round trip to the server before the auction can begin.

  4. Use subresource bundles, as in WICG/webpackage#624 ("WebBundles for Ad Serving"). This is an extension of (3) that fixes the latency issue. Each seller would format their contextual response as a web bundle, which would include both their contextual ads and opaque turtledove signals. Each component would be identified by a distinct uuid-in-package: URL. They would pass the URL to the turtledove signals into the auction, which would proceed as in (3).

Since we want a solution that handles component auctions and performs well, I think the strongest options are (2) and (4). Of these, since (2) requires the browser to generate public keys and implement a new cryptographic protocol, that pushes strongly in favor of (4).

Aside, on proxy-based attacks: every approach here is somewhat vulnerable to a hostile seller running JS on the page. If the attacker is willing to run a proxy and impersonate the browser they can override the API to create the iframe, the API used to call the ad server, or the runAdAuction API, substituting their own URLs that route through the proxy. Running a proxy attack at any appreciable scale, however, would be highly visible, and the sellers' existing anti-fraud systems would be able to detect and respond. Since the issue here is routine signal leakage, let's set aside these attacks as out of scope.
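For readers who want a concrete picture of option (2), here is a minimal sketch of the encrypt/decrypt round trip using Node's built-in `crypto` module. It is an illustration only: the thread does not specify a protocol, so the choice of RSA-OAEP, the function names, and the sample `viewability` signal are all assumptions. RSA can only encrypt small payloads directly, so a real design would more likely use hybrid encryption.

```javascript
const crypto = require("crypto");

// "Browser" side (assumed): generate a per-pageview key pair.
const { publicKey, privateKey } = crypto.generateKeyPairSync("rsa", {
  modulusLength: 2048,
});

// "Seller server" side: encrypt the signals with the public key that arrived
// in a request header. Default padding for publicEncrypt is RSA-OAEP.
function encryptSignals(publicKeyPem, signals) {
  const plaintext = Buffer.from(JSON.stringify(signals), "utf8");
  return crypto.publicEncrypt(publicKeyPem, plaintext).toString("base64");
}

// "Browser" side: decrypt just before handing the signals to the worklets.
function decryptSignals(privateKeyObject, blob) {
  const plaintext = crypto.privateDecrypt(
    privateKeyObject,
    Buffer.from(blob, "base64")
  );
  return JSON.parse(plaintext.toString("utf8"));
}

// Round trip with made-up sample data.
const publicKeyPem = publicKey.export({ type: "spki", format: "pem" });
const blob = encryptSignals(publicKeyPem, { viewability: 0.8 });
const signals = decryptSignals(privateKey, blob);
```

Page scripts would only ever see `blob`, which is the property option (2) is after. The cost noted above still applies: the browser has to generate keys and implement the protocol.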

@JensenPaul
Collaborator

> if the signals are exposed momentarily, then other JS running on the page can read them, so I don't think that works?

If the fetch of the signals happened in an iframe, would that secure them from other scripts on the page? Perhaps the browser could offer something akin to postMessage() to securely feed the signals from the iframe into a FLEDGE auction bidder or seller?

@jeffkaufman
Contributor Author

jeffkaufman commented Jan 28, 2022

I think something like that could work, though it does add additional latency relative to (4). I think this would require adding something to the auction config saying that additional signals are expected, so the browser knows to delay starting the auction until the signals arrive? Spitballing an API, the publisher page could call:

navigator.runAdAuction({
  // ...
  asyncSignalsToken: "random token",
});

Then whatever iframe the seller configures can run:

navigator.provideAuctionSignals("random token", {
  extraSellerSignals: {...},
  extraPerBuyerSignals: {
     "dsp1 origin": {...},
     "dsp2 origin": {...},
  },
});

These can be called in either order: if runAdAuction goes first it waits for provideAuctionSignals before starting, and vice versa.
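The "either order" behavior can be modeled as a rendezvous keyed by the token. The sketch below is a hypothetical simulation in plain JavaScript, not browser code: the two functions here are local stand-ins that borrow the proposed names, and "running the auction" is reduced to merging the config with the extra signals.

```javascript
// Hypothetical model of the token rendezvous; not a browser implementation.
const pending = new Map(); // token -> { promise, resolve }

// Returns the shared slot for a token, creating it on first use by either side.
function slotFor(token) {
  if (!pending.has(token)) {
    let resolve;
    const promise = new Promise((r) => {
      resolve = r;
    });
    pending.set(token, { promise, resolve });
  }
  return pending.get(token);
}

// Stand-in for the call made from the seller's iframe.
function provideAuctionSignals(token, signals) {
  slotFor(token).resolve(signals);
}

// Stand-in for the call made from the publisher page. It waits until the
// signals for its token have been provided, whichever call came first.
async function runAdAuction(config) {
  const extra = await slotFor(config.asyncSignalsToken).promise;
  pending.delete(config.asyncSignalsToken);
  return { ...config, ...extra }; // placeholder for actually running the auction
}
```

Because both calls go through `slotFor`, whichever side arrives first creates the slot and the other side finds it, so neither ordering needs special handling.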

One thing I like about this API is that it supports a (4)-like flow where the signals are returned as an html resource within a webbundled contextual ad response, minimizing the latency impact.

@JensenPaul
Collaborator

Can you describe where the additional latency concern comes from? Is this due to the iframe requirement?

Could we simplify your API by having the "random token" get returned from provideAuctionSignals() rather than passed in? This precludes runAdAuction() being called first, but I think it makes it simpler and more straightforward to use and implement.

@jeffkaufman
Contributor Author

> Can you describe where the additional latency concern comes from? Is this due to the iframe requirement?

Yes: creating an iframe and waiting for it to run JS is going to add latency.

> Could we simplify your API by having the "random token" get returned from provideAuctionSignals() rather than passed in?

That would work, but it would add even more latency because of the postMessage requirement. It would require a flow like:

  1. Page creates iframe
  2. iframe calls provideAuctionSignals and then postMessage with the token
  3. Page receives token, calls runAdAuction

@caraitto
Collaborator

Subresource bundles are now in origin trial (M90-M101) in Chrome.

Perhaps a hybrid postMessage / subresource bundles approach might make sense? Basically, if subresource bundles are available (this would be a runtime check during subresource bundles OT), we allow subresource bundle UUIDs in provideAuctionSignals() parameters:

navigator.provideAuctionSignals("random token", {
  extraSellerSignals: {...},  // Or "[Bundle UUID]"
  extraPerBuyerSignals: {
     "dsp1 origin": {...},
     "dsp2 origin": "Bundle UUID",
  },
});

The bundle UUID would resolve to a JSON resource -- the worklet doesn't know or care if the signals came from a bundle or from a JS object. The same behavior of delaying the auction until all signals have been received would apply.

If / when subresource bundles become standardized, I think we wouldn't need provideAuctionSignals() -- we could just provide the UUIDs to runAdAuction(), since the contents of the web bundle shouldn't be loaded into the renderer process, so the cross-origin iframe approach wouldn't be necessary for isolation. But, I think the hybrid approach allows taking advantage of the benefits of subresource bundles when available without requiring everyone to migrate. (Although, IIUC adopting subresource bundles for this purpose doesn't seem too difficult, assuming it's available).

Of course, this approach has more complexity (needing to deal with both subresource bundles and delaying the auction), which would be good to avoid if the benefits aren't necessary.

@caraitto
Collaborator

caraitto commented Apr 25, 2022

Alternatively, another hybrid approach would be to allow extraSellerSignals / extraPerBuyerSignals to be passed to runAdAuction(), but with UUID values. provideAuctionSignals() would still exist, but it'd accept JSON-serializable objects. Then, we could make it an error to specify both asyncSignalsToken and one or more of extraSellerSignals / extraPerBuyerSignals.

@caraitto
Collaborator

caraitto commented Apr 28, 2022

I'm going to try prototyping the subresource bundle portion of the hybrid approach above (extraSellerSignals / extraPerBuyerSignals passed to runAdAuction()). I can follow up with implementing the provideAuctionSignals() side if there's interest.

@caraitto
Collaborator

caraitto commented May 3, 2022

@jeffkaufman A minor semantic clarification: I think the ACAO response header from CORS doesn't have the ability to restrict access -- on the contrary, it allows access to a resource like application/json files / subresources that have already been restricted by the same origin policy (and also CORB; CORB will prevent the cross-site JSON from being sent to the renderer process, even if a request is made via fetch(), <script>, etc.). My understanding is that the origin of the subresources will be treated as the same as the origin of the wbn package containing them.

The important thing then from the Chromium side is that the request for the subresources should be made as if they were from the seller's origin, and not the frame's origin (which doesn't have to match the seller's).

(Mostly clarifying my understanding here -- you're likely already familiar with this situation).

@caraitto
Collaborator

caraitto commented May 4, 2022

Another question concerns observation of signals by extensions -- I'm assuming this isn't in the threat model for this feature?

@caraitto
Collaborator

caraitto commented May 4, 2022

@jeffkaufman Also, out of curiosity, is the use of the uuid-in-package scheme (vs. subresources with an https scheme) required to achieve the desired isolation of the signals? IIUC, I think both schemes could work -- is there something important about the use of the opaque origins of uuid-in-package resources?

@jeffkaufman
Contributor Author

> Another question concerns observation of signals by extensions -- I'm assuming this isn't in the threat model for this feature?

Yes, not trying to hide things from users

@jeffkaufman
Contributor Author

I think you're right that UUID resources are not necessary for this. In WICG/webpackage#624 they are necessary to get the iframe onto a unique origin, but that isn't a consideration here. I think I probably included them in my comments above because I was thinking too much about bundled ads loading when I wrote them.

@caraitto
Collaborator

Do the requests for signals / bundles need cookies / credentials?

@caraitto
Collaborator

@jeffkaufman FYI, for web bundle requests, I think the server will need to check that the request for the bundle file was made with Sec-Fetch-Dest: webbundle -- this will only be set if the bundle is fetched as part of a <script type="webbundle"> tag.

For instance, a fetch("path/to/bundle.wbn") call will send Sec-Fetch-Dest: empty (the request's destination is the empty string).

This will prevent the page from fetching and parsing the bundle contents itself (in JavaScript, or server-side).
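A server-side version of this check might look like the sketch below. It is a minimal illustration under stated assumptions: `mayServeBundle` is a hypothetical helper, and `headers` is assumed to be a plain object keyed by lower-cased header names, which is how Node's `http` module exposes `req.headers`.

```javascript
// Hypothetical guard for the bundle endpoint. Only requests that the browser
// marked as web bundle loads are allowed through.
function mayServeBundle(headers) {
  return headers["sec-fetch-dest"] === "webbundle";
}

// Example decisions:
const fromScriptTag = mayServeBundle({ "sec-fetch-dest": "webbundle" }); // allowed
const fromFetchCall = mayServeBundle({ "sec-fetch-dest": "empty" }); // refused
const headerMissing = mayServeBundle({}); // refused
```

Page JavaScript cannot set `Sec-` prefixed headers, so the check holds against in-page scripts. A non-browser client can send any header it likes, which matches the proxy and headless-browser cases set aside as out of scope earlier in the thread.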

@caraitto
Collaborator

caraitto commented Jun 1, 2022

@jeffkaufman

If secure signals fail to load, should the auction fail? Or is it acceptable to continue the auction without the secure signals?

@jeffkaufman
Contributor Author

Sorry for missing your questions earlier!

> Do the requests for signals / bundles need cookies / credentials?

Yes, while third party cookies are still a thing we'll need to be sending them for backwards compatibility.

> I think the server will need to check that the request for the bundle file was made with Sec-Fetch-Dest: webbundle -- this will only be set if the bundle is fetched as part of a <script type="webbundle"> tag.

That sounds good!

> If secure signals fail to load, should the auction fail? Or is it acceptable to continue the auction without the secure signals?

In our expected usage the auction would either need to be aborted by the browser or our seller JS would abort it, because some critical signals would be sent this way. But possibly other users of this API would be using it for non-critical signals only?

@caraitto
Collaborator

caraitto commented Jun 6, 2022

Thanks for the responses!

> > If secure signals fail to load, should the auction fail? Or is it acceptable to continue the auction without the secure signals?

> In our expected usage the auction would either need to be aborted by the browser or our seller JS would abort it, because some critical signals would be sent this way. But possibly other users of this API would be using it for non-critical signals only?

Got it -- I think if there's a desire to have non-critical signals, we could always add a new AuctionConfig field for those, so that the critical and non-critical signals can be separated.

@patmmccann
Contributor

> In our expected usage the auction would either need to be aborted by the browser or our seller JS would abort it, because some critical signals would be sent this way.

I'm a bit confused here. The only critical signal seems to be the price of the GAM contextual winner; please correct me if that's not the critical signal you refer to. If GAM/AdX's signals fail, why wouldn't the auction proceed with other sellers? If GAM/AdX contextual doesn't fill, why would you have a critical signal at all? Your example of modeled viewability prediction doesn't seem to be critical.

@jeffkaufman
Contributor Author

I'm no longer at Google, but what I was trying to say was that if the signals can't be transferred into the Turtledove auction we would not want to run an auction. The plan, as far as I understood before leaving, was that the critical signals would always be sent to the client, in the same web bundle as the contextual response.

I think this is likely not a real issue, though: the only situation in which the signals wouldn't be available in the auction despite them being sent to the client would be a browser error, and it doesn't seem a very likely browser error?

caraitto added a commit to caraitto/turtledove that referenced this issue Jul 11, 2022
As requested in WICG#119, auctionConfig.extraSignals provides a mechanism
that sellers can use to offer signals to auction participants, similar
to the existing `auctionSignals` and `sellerSignals`, but accepting a
subresource bundle URL instead of a JSON-serializable JavaScript object.

By passing the auction signals this way (and by setting the
X-FLEDGE-Auction-Only header on the response), scripts running on the
publisher page won't be able to read the signals -- only the intended
auction worklets will have access.
@caraitto
Collaborator

@zhengweiwithoutthei-zz (who I understand is taking this over from Jeff Kaufman).

With the current (non-securely propagated) auction signals, the publisher page calls runAdAuction(), and therefore if the publisher is a different entity than the seller, it doesn't have to respect the seller's wishes with respect to which signals get passed to which buyers (publisher.com, or a script running on its page, could make buyer1.com's signals get passed to buyer2.com, for instance).

Am I correct that we don't want this behavior for the securely propagated auction signals?

@jeffkaufman
Contributor Author

@zhengweiwithoutthei actually tag Zheng

@caraitto
Collaborator

With respect to my prior comment, the current API (see #325) does ensure that the caller of runAdAuction() cannot make buyer1.com's signals get sent to buyer2.com, since the ?perBuyerSignals=buyer1.com bundle subresource URL suffix (which is served by, and therefore controlled by the seller's domain) can only be delivered to buyer1.com.

@zhengweiwithoutthei

@caraitto when do you think #325 will be available for testing? I can see a change in progress https://chromium-review.googlesource.com/c/chromium/src/+/3615057

@zhengweiwithoutthei

@caraitto Question: how does the current spec (#325) support multiple auctions in the document? The signals should be adslot/auction specific. There doesn't seem to be a way to register the subresources for each auction.

Can we pack each auction's extra signals as one object, key it with a UUID, and pass that UUID as directFromSellerSignals?

@caraitto
Collaborator

@zhengweiwithoutthei Sorry I missed your question earlier.

> when do you think #325 will be available for testing?

I have a working end-to-end test, and I'm in the process of breaking off reviewable pieces of that CL to land. I hope to be done in the next few weeks -- after that, the feature should be testable in Canary, and roll out to the other channels.

> how does the current spec (#325) support multiple auctions in the document? The signals should be adslot/auction specific. There doesn't seem to be a way to register the subresources for each auction.

Each runAdAuction() call can be passed a different prefix. So, for example, if you have ad slots 1 and 2 on a page, you could use these prefixes (these are arbitrary -- they just need to be different from each other):

https://seller.com/adSlot/1/signals
https://seller.com/adSlot/2/signals
etc.

This would get expanded using the pre-defined suffixes into the following subresource URLs, not all of which need be present in bundle file(s) (DirectFromSellerSignals doesn't care which subresource comes from which bundle -- they could all come from the same bundle file, as long as they come from the seller's origin):

https://seller.com/adSlot/1/signals?sellerSignals
https://seller.com/adSlot/1/signals?auctionSignals
https://seller.com/adSlot/1/signals?perBuyerSignals=https://buyer1.com
https://seller.com/adSlot/1/signals?perBuyerSignals=https://buyer2.com
https://seller.com/adSlot/1/signals?perBuyerSignals=[... etc.]

https://seller.com/adSlot/2/signals?sellerSignals
https://seller.com/adSlot/2/signals?auctionSignals
https://seller.com/adSlot/2/signals?perBuyerSignals=https://buyer1.com
https://seller.com/adSlot/2/signals?perBuyerSignals=https://buyer2.com
https://seller.com/adSlot/2/signals?perBuyerSignals=[... etc.]

etc.

Note that UUID URLs (with the uuid-in-package scheme) are not supported, as they don't support CORS, as mentioned in #325. (They also wouldn't work well with the prefix / suffix system). Instead, subresources use URLs that look like ordinary https URLs.
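The prefix / suffix expansion described above can be written down as a small helper. This is a sketch for illustration: `expandSignalUrls` is a hypothetical function, not part of the API, and buyer origins are appended unencoded because that is how the URLs are written in this thread. Whether the browser expects the origin to be percent-encoded is not stated here.

```javascript
// Hypothetical helper that lists the subresource URLs the browser would look
// for, given one runAdAuction() prefix and the buyers in that auction.
function expandSignalUrls(prefix, buyerOrigins) {
  return [
    `${prefix}?sellerSignals`,
    `${prefix}?auctionSignals`,
    ...buyerOrigins.map((origin) => `${prefix}?perBuyerSignals=${origin}`),
  ];
}

// Ad slot 1 from the example above, with two buyers.
const urls = expandSignalUrls("https://seller.com/adSlot/1/signals", [
  "https://buyer1.com",
  "https://buyer2.com",
]);
```

Using a different prefix per ad slot keeps each auction's signals separate, even when all of the subresources are delivered in one bundle file.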

@zhengweiwithoutthei

@caraitto Thanks! This is useful. Should we clarify that the prefix can be slot specific in the explainer?

@caraitto
Collaborator

@zhengweiwithoutthei Sure, I mailed #363 to do that.

@caraitto
Collaborator

caraitto commented Oct 7, 2022

Question for @zhengweiwithoutthei, @patmmccann and anyone considering using this feature:

With respect to failing the auction if the signals fail to load -- would this just happen for sellerSignals / auctionSignals load failures (since these are delivered to the seller's worklet), or would it also apply to failing to load perBuyerSignals for a given buyer (since these are only delivered to bidder worklets)?

In other words, if the perBuyerSignals for a single buyer fail to load, should we fail the whole auction?

I'm considering changing the explainer to just have the signals be passed to worklet functions as null if they fail to load -- the seller worklet can decide to fail if it was expecting DirectFromSellerSignals sellerSignals / auctionSignals but doesn't receive them. Likewise, buyers could choose to bid or not bid in the event they didn't receive the perBuyerSignals or auctionSignals that were expected.

This simplifies the implementation somewhat, and it also gives sellers and buyers the ability to decide what to do. However, with the design as-is, the seller worklet has no insight into whether or not the bidders were able to load their perBuyerSignals, and the auction will proceed even if some but not all bidders successfully load their perBuyerSignals.

I'm wondering if that's acceptable or not?

(FWIW, I don't think these errors are very likely most of the time -- they probably indicate that required headers are missing, JSON is invalid, or perhaps that the download of the bundle file failed, for instance, due to the seller's server being down).

@zhengweiwithoutthei

I like the idea of passing null to the worklet instead of failing the auction if the corresponding signal fails to load. The seller's and buyers' scripts can decide what to do in this situation.
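As an illustration of a seller script "deciding what to do", the sketch below treats missing sellerSignals as critical and rejects the bid. Everything here is hypothetical: `scoreBid` is not the real scoreAd() signature, the `floor` field is invented for the example, and the scoring rule is a placeholder. The only behavior taken from FLEDGE itself is that a score of zero means the ad cannot win.

```javascript
// Hypothetical seller-side policy for null signals.
function scoreBid(bid, directFromSellerSignals) {
  const sellerSignals =
    directFromSellerSignals && directFromSellerSignals.sellerSignals;

  // Signals failed to load: this seller considers them critical, so it
  // rejects the bid rather than scoring without them.
  if (sellerSignals == null) {
    return 0;
  }

  // `floor` is an invented example field, not something from the thread.
  const floor = sellerSignals.floor || 0;
  return bid >= floor ? bid : 0;
}
```

A seller that only sends non-critical signals this way could instead fall back to a default and keep scoring, which is the flexibility the null-passing design allows.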

@zhengweiwithoutthei

(4) requires ad tech to adopt a new technology - WebBundle - which comes with a cost and could negatively impact the adoptability of the FLEDGE API. All we need here is a pathway for the seller to provide signals to the auction and make them only accessible by the worklet. Is it possible for FLEDGE to introduce a new header which can be appended on the contextual ad response to securely transfer data? The signals can be trusted to come from the seller because the header is returned on a response from the seller's origin. The header can be something like:

X-FLEDGE-Secure-Signals: {"adSlot/1": {
  "sellerSignals": {/*…*/},
  "auctionSignals": {/*…*/},
  "perBuyerSignals": {"https://buyer1.example": /*…*/}
}}

Chrome can make it so that:

  1. The signal included in this header cannot be read by scripts on the publisher’s page.

  2. When Chrome sees this header, it stores the signals for later auctions.

One possible way for (2) is for Chrome to register the content as a subresource bundle under the hood with required content type and response headers. The URLs will be registered as seller’s origin/adslot_id + pre-defined suffixes. In this case, there is no change to the current spec for directFromSellerSignals. Caller will still need to pass the prefix to runAdAuction() but no need to fetch or register the bundle.

This should make multi-seller support easier too.
wdyt?

@caraitto
Collaborator

I mailed #376 to pass null and not fail the auction -- anyone interested in this feature should feel free to leave feedback in that PR, or this thread.

Zheng, I'll respond to your proposal in a follow-up comment.

@caraitto
Collaborator

caraitto commented Dec 1, 2022

@zhengweiwithoutthei @patmmccann FYI, DirectFromSellerSignals is now ready for testing in the latest Canary (version 110.0.5451.0 or later). It will roll out to Chrome beta and stable in version 110.

@zhengweiwithoutthei

@caraitto Thanks! We are able to verify that DirectFromSellerSignals WAI.
By the way, please take a look at #423

@patmmccann
Contributor

patmmccann commented May 24, 2023

@michaelkleber as reference for our discussion on Thursday

You said in #418

> Solving this problem is absolutely in line with the Privacy Sandbox goals: user privacy should be preserved, and website owners should have transparency and control over what third-parties on their pages can do.

As @MattMenke2 aptly summarizes in the merge of #325, the purpose of this feature is the opposite, with the publisher the threat and @jeffkaufman dictating requirements to the Chrome team with the clear goal of preserving AdX's advantage over other sellers by not exposing price of the contextual winner to publishers at event time.

This lack of exposure is the only reason publishers submit other ssp bids to GAM, and GAM now subjects those bids to a secret floor, suppressing them without sharing that floor, and ensuring adx never sees a higher floor than another seller or misses an impression opportunity.

This lack of exposure also allows the GAM team to extend their market position into the FLEDGE auction. By not exposing the price of the contextual auction to publishers, publishers will have to use GAM to call runAdAuction if they want to use GAM at all, a choice they don't have because AdWords won't transact elsewhere. The more natural workflow would simply be the contextual win price is exposed to the publisher who then calls runAdAuction if they so choose and conducts a top level auction themselves.

This would allow publishers to gather demand from whichever sources they choose without passing it all to GAM. In a strange irony, in protecting adx from leaking its price to publishers, you may not realize how adx requires all ATP to leak their prices to it and only this feature preserves that into the sandbox.

In building this particular feature, you appear to have cemented the position of GAM forever and violated the principles as you laid them out. I think you should revert it.

I look forward to our discussion tomorrow.

@JensenPaul added the Non-breaking Feature Request label Jun 23, 2023