Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cross Origin pthread worker #21937

Open
msqr1 opened this issue May 14, 2024 · 33 comments
Open

Cross Origin pthread worker #21937

msqr1 opened this issue May 14, 2024 · 33 comments

Comments

@msqr1
Copy link

msqr1 commented May 14, 2024

I love the new inline worker change, but now we can't never use a 3rd-party CDN to serve those files again, because Worker construction requires the file to be in the same origin. Previously, we can just host the very small worker file on our server, while serving the heavy main file through a CDN, but now we can't anymore. Is there any way we could address this problem?

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

Can you explain why this doesn't work? Why cant you serve the main JS file from the CDN?

We do have setting called mainScriptUrlOrBlob which controls where the main JS file finds itself. I'm curious why this would be needed though since I would expect that loading JS file from the same location from which the main thread already loaded it would make the most sense and result in the best caching behaviour. In fact, shouldn't the browser already have the main JS file cached, resulting in no extra JS fetches per-thread?

@msqr1
Copy link
Author

msqr1 commented May 14, 2024

I know it is cached, but it is still from cross origin. The thing that matter is the origin, not cached or not.

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

Sorry I didn't read the title there.

I'm not too familiar with cross-origin limitations. If you, or anyone else, has any ideas on how to make this work better that would be most welcome.

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

So, to be clear, the security policy disassllows new Worker with a cross origin script but then allows the new worker to call importScripts from other origins?

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

Trying to understand the issue, the only mention I see about same-origin restrictions in workers looks like it related to subworkers:
https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers. Are you using subworkers? Or is this some kind of additional non-default policy that you have setup?

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

Hmm ... I wonder if we could do something line new Worker(blobUrl("importScripts(...)" to work around this?

Maybe you could even try this out by adding something like this to your pre-js file:

Module['mainScriptUrlOrBlob'] = `importScripts(${document.currentScript})`;

?

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

I tried this locally and the following --pre-js seemed to work:

if (!ENVIRONMENT_IS_PTHREAD) {
  Module['mainScriptUrlOrBlob'] = new Blob([`importScripts('${document.currentScript.src}');`])
}

@msqr1
Copy link
Author

msqr1 commented May 14, 2024

Did you try to serve that file through a cdn?

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

Nope, I'm just going by the fact that importScripts from your CDN worked for you in the past. So this is way to effectively revert to the prior behaviour.

@msqr1
Copy link
Author

msqr1 commented May 14, 2024

Yes, importscript in the worker seems to work fine. We should integrate this fix because I'm sure many people will use a cdn

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

I'm confused by https://developer.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers though. It seems to suggest that only sub-workers have this requirement by default? So by default, using a CDN like this should work?

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

Can you confirm that the --pre-js fix works for you your use case?

@msqr1
Copy link
Author

msqr1 commented May 14, 2024

I know it's weird, but see this: https://stackoverflow.com/questions/58098143/why-are-cross-origin-workers-blocked-and-why-is-the-workaround-ok

Basically the default mode for worker URL fetching is same-origin, so a cross origin URL will fail. Importscript default is just a normal fetch, so it works

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

So it sounds like maybe MDN is just wrong on this a not up-to-date with the spec, since it only mentions the same-origin requirement for sub-workers.

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

@msqr1
Copy link
Author

msqr1 commented May 14, 2024

Yeah it does, probably should report this to them.

@msqr1
Copy link
Author

msqr1 commented May 14, 2024

Can you confirm that the --pre-js fix works for you your use case?

It doesn't, I'm using modularize, so in the module callback document.currentScript is null. I used the _scriptName global variable that keeps a reference to document.currentScript.src instead. It works. Normally, without modularize, it is available, but with that on, it won't.

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

Can you share the full fix you came up with here?

@msqr1
Copy link
Author

msqr1 commented May 14, 2024

I took yours and replace document.currentScript.src with _scriptName for modularize mode, not sure about non-modularization

if (!ENVIRONMENT_IS_PTHREAD) {
  Module['mainScriptUrlOrBlob'] = new Blob([`importScripts('${_scriptName}');`])
}

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

Does that work? Looking at the code it seems that _scriptName is set only after the --pre-js code is included.

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

(We could potentially move the _scriptName assignment earlier).

@msqr1
Copy link
Author

msqr1 commented May 14, 2024

Wait, I thought the _scriptName is declared before the iife is returned to be used in the function?

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

Wait, I thought the _scriptName is declared before the iife is returned to be used in the function?

I think MODULARIZE mode differs in this respect and injects _scriptName very early on.

@msqr1
Copy link
Author

msqr1 commented May 14, 2024

PR?

@sbc100
Copy link
Collaborator

sbc100 commented May 14, 2024

I'd like to better understand the problem, and decide if we should do this default or make it a new option. I also wonder if there is any downside to doing in the case when things are hosted on the origin. For example, does it make debugging hard / more opaque when the URL passed to new Workers is a blob that then called importScripts vs just a URL?

For now, is it OK if you use your workaround?

@msqr1
Copy link
Author

msqr1 commented May 14, 2024

I'm happy with my fix, but I think this should be the default if we decide to go with it. People wont expect this bug when they switch to a new version that keeps them from using a CDN. Also:
image
We can move capturing it to before the pre.js if possible

@msqr1
Copy link
Author

msqr1 commented May 15, 2024

To be fair MDN kinda said it here: https://developer.mozilla.org/en-US/docs/Web/API/Worker/Worker that the URL should be same origin as the host

One more thing, if we decide to go with this, don't forget to set Blob's MIME type to text/javascript, otherwise it won't work on Firefox: https://developer.mozilla.org/en-US/docs/Web/API/Worker/Worker#:~:text=Strict%20MIME%20type%20checks%20for%20worker%20scripts

@sbc100
Copy link
Collaborator

sbc100 commented May 15, 2024

@msqr1 I'm curious, are there some CDNs (e.g couldflare and fastly type things) that don't require you to do cross-origin stuff? i.e. they take over your whole domain, so to speak? I could be misunderstanding but I wonder how popular those two different types of solution are?

@msqr1
Copy link
Author

msqr1 commented May 15, 2024

There is no such CDN service that I know of. But I think Website hosting service does exactly what you say, they have servers everywhere that basically acts as CDN. Those servers serve your main page, your assets and everything else you have. You can basically treat it as your own domain while still utilizing their "CDN" to serve files fast. I don't think it would make sense for a CDN to take over your domain, because what CDN does is it helps you minimize distance by utilizing their numerous servers available around the globe (because you only have one, a small, or a few server).

@msqr1
Copy link
Author

msqr1 commented May 23, 2024

I also wonder if there is any downside to doing in the case when things are hosted on the origin.

@sbc100, CSP needs to be set for worker-src or default-src to blob for the workaround to spawn the worker. But at this point, why don't we just spawn the whole worker content straight from the blob already? Why importScript if we're using blob anyway?

In my opinion, if we choose to do this, I would make a new setting like CROSS_ORIGIN_PTHREAD that uses the new approach of spawning from the blob.

@sbc100
Copy link
Collaborator

sbc100 commented May 23, 2024

I also wonder if there is any downside to doing in the case when things are hosted on the origin.

@sbc100, CSP needs to be set for worker-src or default-src to blob for the workaround to spawn the worker. But at this point, why don't we just spawn the whole worker content straight from the blob already? Why importScript if we're using blob anyway?

In my opinion, if we choose to do this, I would make a new setting like CROSS_ORIGIN_PTHREAD that uses the new approach of spawning from the blob.

It seems reasonable to add a new option for this yes. Feel tree to send a patch. BTW, I would call the option PTHREAD_CROSS_ORIGIN so that is groups with the other PTHREAD_-prefixed settings.

I'm also curious to see what proportion of folks are affected by this.

@sbc100
Copy link
Collaborator

sbc100 commented May 23, 2024

If the workaround requires worker-src or default-src already then why not use the importScript method? One advantage to this is debug-ability since the worker then knows the source of its source, as it were. Also, I can imagine that the importScripts methods is much for friendly to the browser cache, and maybe even the v8 code case, and could be faster for that reason.

@msqr1
Copy link
Author

msqr1 commented May 23, 2024

You're right, I forgot about the cache

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants