sql: Performance problems with queries to pg_catalog #57249
Hello, I am Blathers. I am here to help you get the issue triaged. Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here. I have CC'd a few people who may be able to assist you:
If we have not gotten back to your issue within a few business days, you can try the following:
🦉 Hoot! I am Blathers, a bot for CockroachDB. My owner is otan.
Thanks for the report! I haven't been able to reproduce this locally. I created a local single-node v20.2.2 cluster with 10 tables, then added 100 more tables in a different database. In both cases, the query took under 20ms.
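For anyone trying to reproduce, here is a sketch of a setup like the one described above; the table names, the count of 100, and the use of generate_series to emit DDL are illustrative assumptions, not the exact statements used:

```sql
-- Generate CREATE TABLE statements for 100 empty tables (hypothetical
-- names t1..t100), then feed the output back into the SQL shell.
SELECT 'CREATE TABLE t' || i::STRING || ' (id INT PRIMARY KEY);'
FROM generate_series(1, 100) AS g(i);

-- Then time the JDBC driver's type-lookup query; the cockroach SQL
-- shell prints the elapsed time after each statement.
SELECT typname FROM pg_catalog.pg_type WHERE typname = 'jsonb';
```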
A couple of questions for you, @gyrter:
Thank you for the fast response. I will try to help you with this issue.
Also, this query has different query plans on different versions of CockroachDB. The main difference is that the query has a join on pg_am.
All DB instances run on the same nodes in k8s.
Hi. We use Hibernate too, and I see this performance issue during schema validation. I also see it in CockroachDB Cloud. I have filed support request https://support.cockroachlabs.com/hc/en-us/requests/6880 and added diagnostics details.
I created a remote 3-node cluster running v20.2.2 and added 100 empty tables to it. Running the query against it still takes only 45ms, so unfortunately I still have not been able to reproduce the slowness. Thanks for sharing those explain plans. I think the plans actually are the same, though maybe the output format changed; I see both include the join on pg_am.
The statement bundle you provided should be helpful. We will look into that next.
@rafiss Hi, steps to reproduce:
I also see slow performance in TablePlus. Adding more bundles for you: stmt-bundle-612213293907738625.zip
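For reference, bundles like the ones attached in this thread can be collected from the SQL shell with EXPLAIN ANALYZE (DEBUG) (available since v20.1); the inner query here is a shortened form of the driver's type lookup from the logs:

```sql
-- Produce a statement bundle (plan, trace, schema) for the slow query.
EXPLAIN ANALYZE (DEBUG)
  SELECT typinput = 'array_in'::REGPROC AS is_array, typtype, typname
  FROM pg_catalog.pg_type
  WHERE typname = 'jsonb';
```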
Thanks for these steps. I can reproduce more of the slowness now. I think the existence of more foreign keys makes these queries slower, so using the pgdump you provided was helpful. I believe the underlying cause is new validation logic in v20.2.x that is performed when fetching table descriptors. We fetch these descriptors every time any data from a pg_catalog table is read. We'll work on improving the performance and include the improvement in a patch release of 20.2 as soon as we can.
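A sketch of the kind of schema shape implied above, with foreign keys multiplying the descriptor cross-references that the v20.2 validation chases (all names hypothetical):

```sql
-- Each REFERENCES clause adds a cross-reference that read-path
-- validation must resolve whenever pg_catalog is scanned.
CREATE TABLE parent (id INT PRIMARY KEY);
CREATE TABLE child_1 (id INT PRIMARY KEY, parent_id INT REFERENCES parent (id));
CREATE TABLE child_2 (id INT PRIMARY KEY, parent_id INT REFERENCES parent (id));
-- ...repeated for many tables; any pg_catalog read then pays the cost:
SELECT count(*) FROM pg_catalog.pg_attribute;
```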
Hi @jordanlewis, please add branch-* labels to identify which branch(es) this release-blocker affects. 🦉 Hoot! I am Blathers, a bot for CockroachDB. My owner is otan.
In 20.2 we added logic to validate descriptors on the read path for getting all descriptors, and introduced a performance regression: we now retrieve all databases and cross-references for all tables. This stricter validation has proven to be somewhat handy and has become utilized in some tests to ensure that descriptors are indeed valid, so eliminating it would be problematic. Fortunately, this is easy to resolve, as we already have all of the descriptors sitting in memory; with this change, we use them.

Fixes cockroachdb#57249.

In the previously added benchmark tracking the number of KV calls, the code used to perform 32 reads and now performs 1 (the 32 mostly has to do with resolving the system database when validating system tables).

Release note (bug fix): Fixed a performance regression, introduced in v20.2, in reading virtual tables which introspect the schema.
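A simple way to see the regression (and the fix) from the cockroach SQL shell, which prints the elapsed time after each statement; these particular queries are just illustrative virtual-table scans:

```sql
-- Any virtual table that introspects the schema exercises the
-- descriptor-fetching path; compare elapsed times across versions.
SELECT count(*) FROM pg_catalog.pg_class;
SELECT count(*) FROM information_schema.tables;
```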
57231: roachtest: skip jobs/mixed-version r=dt a=pbardea

This test runs import in a mixed-version setting, which is known to be not supported. This test is failing often enough that we're not getting good signal from it. Fixes #56038. Release note: None

57382: importccl: support IMPORT in mixed-version cluster r=dt a=pbardea

This commit adds support for running IMPORT in a mixed-version cluster. Note that it does not add support for running an IMPORT during the upgrade of a node, however; the IMPORT job would need to be restarted in that case. Release note (sql change): Add support for running IMPORT in a mixed-version cluster.

57542: catalogkv: use in-memory descriptors to validate in GetAllDescriptors r=lucy-zhang a=ajwerner

(Same description as the commit message quoted above.) Fixes #57249.

Co-authored-by: Paul Bardea <pbardea@gmail.com>
Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
Describe the problem
Hello. I migrated from CockroachDB 20.1.8 to 20.2.2 and got problems with application start. On startup, our ORM (via the PostgreSQL JDBC driver) issues queries to pg_catalog to determine types and relations, and these queries became very slow. We continued investigating and found that response time correlates with the number of tables in the DB. I also found that one node in the cluster works well.
To Reproduce
I201130 10:52:27.993894 5920071 sql/exec_log.go:225 ⋮ [n3,client=‹10.111.7.3:38470›,hostssl,user=‹db1›] 3 112.396ms ‹exec› ‹"PostgreSQL JDBC Driver"› ‹{}› ‹"SELECT typinput = 'array_in'::REGPROC AS is_array, typtype, typname FROM pg_catalog.pg_type LEFT JOIN (SELECT ns.oid AS nspoid, ns.nspname, r.r FROM pg_namespace AS ns JOIN (SELECT s.r, (current_schemas(false))[s.r] AS nspname FROM ROWS FROM (generate_series(1, array_upper(current_schemas(false), 1))) AS s (r)) AS r USING (nspname)) AS sp ON sp.nspoid = typnamespace WHERE typname = $1 ORDER BY sp.r, pg_type.oid DESC"› ‹{$1:"'jsonb'"}› 1 ‹""› 0 ‹{ LATENCY_THRESHOLD }›
I201130 10:36:00.786376 5085793 sql/exec_log.go:225 ⋮ [n2,client=‹192.168.114.18:21850›,hostssl,user=‹db1›] 67 520.064ms ‹exec› ‹"PostgreSQL JDBC Driver"› ‹{}› ‹"SELECT typinput = 'array_in'::REGPROC AS is_array, typtype, typname FROM pg_catalog.pg_type LEFT JOIN (SELECT ns.oid AS nspoid, ns.nspname, r.r FROM pg_namespace AS ns JOIN (SELECT s.r, (current_schemas(false))[s.r] AS nspname FROM ROWS FROM (generate_series(1, array_upper(current_schemas(false), 1))) AS s (r)) AS r USING (nspname)) AS sp ON sp.nspoid = typnamespace WHERE typname = $1 ORDER BY sp.r, pg_type.oid DESC"› ‹{$1:"'jsonb'"}› 1 ‹""› 0 ‹{ LATENCY_THRESHOLD }›
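For readability, the driver query from the log lines above, reformatted (with $1 = 'jsonb'):

```sql
SELECT typinput = 'array_in'::REGPROC AS is_array, typtype, typname
FROM pg_catalog.pg_type
LEFT JOIN (
  SELECT ns.oid AS nspoid, ns.nspname, r.r
  FROM pg_namespace AS ns
  JOIN (
    SELECT s.r, (current_schemas(false))[s.r] AS nspname
    FROM ROWS FROM (generate_series(1, array_upper(current_schemas(false), 1))) AS s (r)
  ) AS r USING (nspname)
) AS sp ON sp.nspoid = typnamespace
WHERE typname = $1
ORDER BY sp.r, pg_type.oid DESC;
```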
Expected behavior
I expected better query times for catalog queries.
Environment: