store,bulk: log when delaying AddSSTable, collect + log more timings in bulk-ingest #41196
Merged
storage: log when AddSSTable requests are delayed

If the rate-limiting and back-pressure mechanisms kick in, they can dramatically delay requests in some cases. However, it can currently be unclear that this is happening, and the system may simply appear slow. Logging when requests are delayed by more than a second should help identify when this is the cause of slowness.

Release note: none.

Release justification: low-risk (logging only) change that could significantly help in diagnosing 'stuck' jobs based on logs (which are often all we have to go on).
bulk: track and log more timings

This tracks and logs the time spent in the various stages of ingestion: sorting, splitting, and flushing. This helps when trying to diagnose why a job is 'slow' or 'stuck'.

Release note: none.

Release justification: low-risk (logging only) changes that improve ability to diagnose problems.
ajwerner approved these changes on Sep 30, 2019
Reviewed 1 of 1 files at r1, 2 of 2 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @lucy-zhang)
bors r+
craig bot pushed a commit that referenced this pull request on Sep 30, 2019
40493: sql: Display inherited constraints in SHOW PARTITIONS r=andreimatei a=rohany

SHOW PARTITIONS now displays the inherited zone configuration of the partitions in a separate column. To accomplish this, the crdb_internal.zones table now holds on to the inherited constraints of each zone in a separate column. Additionally, the crdb_internal.partitions table holds on to the zone_id and subzone_id of the zone configuration the partition refers to. These IDs correspond to the zone configuration at the lowest point in that partition's "inheritance chain".

Release justification: Adds a low risk, good to have UX feature.

Fixes #40349.

Release note (sql change):
* SHOW PARTITIONS now displays inherited zone configurations.
* Adds the zone_id, subzone_id columns to crdb_internal.partitions, which form a link to the corresponding zone config in crdb_internal.zones that applies to the partitions.
* Rename the config_yaml, config_sql and config_proto columns in crdb_internal.zones to raw_config_yaml, raw_config_sql, raw_config_proto.
* Add the columns full_config_sql and full_config_yaml to the crdb_internal.zones table, which display the full/inherited zone configuration.

41138: movr: Add stats collection to movr workload run r=danhhz a=rohany

This PR adds tracking stats for each kind of query in the movr workload so that output is displayed from cockroach workload run. Additionally, this refactors the movr workload to define the work as functions on a worker struct. This will hopefully avoid a common gotcha of having different workers share the same non-thread-safe histograms object.

Release justification: low risk nice to have feature

Release note: None

41196: store,bulk: log when delaying AddSSTable, collect + log more timings in bulk-ingest r=dt a=dt

storage: log when AddSSTable requests are delayed

If the rate-limiting and back-pressure mechanisms kick in, they can dramatically delay requests in some cases. However, it can currently be unclear that this is happening, and the system may simply appear slow. Logging when requests are delayed by more than a second should help identify when this is the cause of slowness.

Release note: none.

Release justification: low-risk (logging only) change that could significantly help in diagnosing 'stuck' jobs based on logs (which are often all we have to go on).

bulk: track and log more timings

This tracks and logs the time spent in the various stages of ingestion: sorting, splitting, and flushing. This helps when trying to diagnose why a job is 'slow' or 'stuck'.

Release note: none.

Release justification: low-risk (logging only) changes that improve ability to diagnose problems.

Co-authored-by: Rohan Yadav <[email protected]>
Co-authored-by: Rohan Yadav <[email protected]>
Co-authored-by: David Taylor <[email protected]>
Build succeeded