Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

worker thread expose its tid #30061

Closed
abbshr opened this issue Oct 22, 2019 · 14 comments
Closed

worker thread expose its tid #30061

abbshr opened this issue Oct 22, 2019 · 14 comments
Labels
feature request Issues that request new features to be added to Node.js. stale worker Issues and PRs related to Worker support.

Comments

@abbshr
Copy link
Contributor

abbshr commented Oct 22, 2019

Is your feature request related to a problem? Please describe.

I write program under linux, and I want to collect metrics (like cpu times, ctx switch, memory usage, ...) of the specified worker thread by the tid.

It seems impossible to get the statistics information provided by the os of the given thread by using a worker.threadId, and it is only make sense within the process.

Maybe we could expose a new property to return the tid assigned by os, such as worker.tid?

Then, we could hold the tid to get more stats under /proc or do something else meaningful.

Describe the solution you'd like

Make a system call (224) under linux is enough.

#include <unistd.h>
#include <sys/syscall.h>

void *worker_thread()
{
  int tid = syscall(__NR_gettid);
}

or pthread_self() (but this return a address, need some trick to pick the os level tid)

@gireeshpunathil gireeshpunathil added the feature request Issues that request new features to be added to Node.js. label Oct 22, 2019
@gireeshpunathil
Copy link
Member

cc @addaleax @nodejs/workers . I think an id was preferred over the real tid for a reason, but I can't remember.

@addaleax
Copy link
Member

I think the issue description already gives a meaningful reason why this is tricky -- whatever we do in this direction is going to be platform-dependent and maybe a bit hacky. Using gettid() would give a meaningful result on Linux, but what do we do for other platforms?

I wouldn't be against introducing something like a thread ID field that means something to the OS, but figuring out what this means outside of Linux and what one could do with that ID elsewhere would be a good start.

@bnoordhuis
Copy link
Member

Aside, this seems like a problematic concept if/when thread migration is introduced. Right now there's a 1-to-1 correspondence between workers and their OS threads but there doesn't have to be.

@benjamingr
Copy link
Member

Right now there's a 1-to-1 correspondence between workers and their OS threads but there doesn't have to be.

Precisely that. That said it does sound genuinely useful for tooling authors to be able to do this regardless of whether the abstraction is "worker per thread" - I am just not sure if telling people "make a syscall" like in the OP is a bad solution.

@gireeshpunathil
Copy link
Member

  • isn't the one-worker-one-os-thread design there to live forever? as long as v8 prohibits an isolate to be operated by multiple native threads?

  • aren't all the OS's provide some form of tids ? at least I can see linux, windows , mac, and AIX have ways to get their identifiers. Not necessarily these ids are usable as is externally, but still provides a way to uniquely identify threads and make meaningful associations.

  • the only possible way to profile threads at present looks to be through reverse engineering?

@addaleax
Copy link
Member

  • isn't the one-worker-one-os-thread design there to live forever? as long as v8 prohibits an isolate to be operated by multiple native threads?

V8 does not prohibit that, and in fact provides APIs to allow it. It prohibits using an Isolate from multiple native threads concurrently or using multiple Isolates from one thread concurrently, though. This may never get implemented, but at the same time it’s not unthinkable to run e.g. multiple Workers on a shared native thread.

  • aren't all the OS's provide some form of tids ? at least I can see linux, windows , mac, and AIX have ways to get their identifiers. Not necessarily these ids are usable as is externally, but still provides a way to uniquely identify threads and make meaningful associations.

Fwiw, here’s what Chromium does:

https://chromium.googlesource.com/chromium/chromium/+/068885c2c5fda617e634bb73a107a0285af470ff/base/threading/platform_thread_posix.cc#156

I don’t think the return value of that is meaningful outside of the process on non-Linux platforms (and maybe also Windows, I don’t know about that).

  • the only possible way to profile threads at present looks to be through reverse engineering?

I don’t think there’s any reverse engineering involved in what OP is doing – doing a syscall to gettid() and using that to look up data in /proc/${pid} as suggested by OP (or using the also Linux-specific RUSAGE_THREAD with getrusage()) seems like a reasonable approach to me. I know it’s not ideal, but using native addons is kind of exactly what we’ve been recommending people to do in the past if they want features that are only really available on one platform.

That being said, if you are actually talking about profiling Worker threads, that’s already possible through the standard profiling measures like --prof.

@gireeshpunathil
Copy link
Member

thanks @addaleax . The only one point I want to clarify is profiling; I meant a wider scope for program analysis, not just CPU profiling.

@benjamingr
Copy link
Member

@addaleax from what I can tell Chromium calls GetCurrentThreadId here: https://chromium.googlesource.com/chromium/chromium/+/068885c2c5fda617e634bb73a107a0285af470ff/base/threading/platform_thread_win.cc#104 which seems to return a meaningful result on windows:

Until the thread terminates, the thread identifier uniquely identifies the thread throughout the system.

So it looks like this is doable in windows (and thus cross-platform) if we want to provide it in core - which I am not sure we do.

@addaleax
Copy link
Member

@benjamingr If you’re focused on the unique-identifier aspect, pthread_self() returns a value that is not guaranteed to be unique across processes. But if you want only that, then the pid + worker id pair might do a better job anyway.

To me it really seems more like OP is focused on using the thread id to gather meaningful information about the thread, and that is what I’m having a hard time seeing happening cross-platform. I think the closest we can get is something like the process.resourceUsage() API, where we’ve kind of accepted that a lot of the values returned by it are just going to be zero on a lot of platforms. But here it seems like only Linux would reasonably be supported, afaict.

@Fishrock123 Fishrock123 added the worker Issues and PRs related to Worker support. label Nov 19, 2019
@addaleax
Copy link
Member

@abbshr Any further thoughts on this, given the discussion above?

@abbshr
Copy link
Contributor Author

abbshr commented Dec 16, 2019

I think the closest we can get is something like the process.resourceUsage() API, where we’ve kind of accepted that a lot of the values returned by it are just going to be zero on a lot of platforms

@addaleax For me, I really prefer this solution, but it seems not to be an universal approach (not kind to other platform users).

So, I think maybe an addon is more flexible and suitable right now, unless the worker_thread module find a better way to support it.

@aixtools
Copy link

aixtools commented Jun 5, 2020

In the hope this helps, i.e., something new and useable - fyi - CPython had a discussion for adding OS support to get the "real" tid from it's "worker" concept - even if mono-threaded.

This is the "multi-platform" code - in Python/thread_pthread.h

#ifdef PY_HAVE_THREAD_NATIVE_ID
unsigned long
PyThread_get_thread_native_id(void)
{
    if (!initialized)
        PyThread_init_thread();
#ifdef __APPLE__
    uint64_t native_id;
    (void) pthread_threadid_np(NULL, &native_id);
#elif defined(__linux__)
    pid_t native_id;
    native_id = syscall(SYS_gettid);
#elif defined(__FreeBSD__)
    int native_id;
    native_id = pthread_getthreadid_np();
#elif defined(__OpenBSD__)
    pid_t native_id;
    native_id = getthrid();
#elif defined(_AIX)
    tid_t native_id;
    native_id = thread_self();
#elif defined(__NetBSD__)
    lwpid_t native_id;
    native_id = _lwp_self();
#endif
    return (unsigned long) native_id;
}
#endif

@github-actions
Copy link
Contributor

github-actions bot commented Mar 8, 2022

There has been no activity on this feature request for 5 months and it is unlikely to be implemented. It will be closed 6 months after the last non-automated comment.

For more information on how the project manages feature requests, please consult the feature request management document.

@github-actions github-actions bot added the stale label Mar 8, 2022
@github-actions
Copy link
Contributor

github-actions bot commented Apr 8, 2022

There has been no activity on this feature request and it is being closed. If you feel closing this issue is not the right thing to do, please leave a comment.

For more information on how the project manages feature requests, please consult the feature request management document.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request Issues that request new features to be added to Node.js. stale worker Issues and PRs related to Worker support.
Projects
Development

No branches or pull requests

7 participants