Skip to content

Date bin shift #1201

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Nov 28, 2016
Merged

Date bin shift #1201

merged 6 commits into from
Nov 28, 2016

Conversation

alexcjohnson
Copy link
Collaborator

Fixes #1151

Implements an auto-shifting algorithm for nonuniform (n-month) bins in date histograms. Covers 3 cases:

  • Most data is on Jan 1 with no time -> shift bins to start July 2 12:00, which is exactly halfway through a non leap year.
  • Most data is on the first of months with no time -> shift bins to start on the 16th of every month at noon. This is exactly halfway through 31-day months, which ends up meaning that any month after one with 31 days, the hovertext will say the first of that month, months after shorter months the month label will always be correct, but the date/time will be 12-36 hours into that month.
  • Most data has no time, only date -> shift bins 12 hours.

A couple of comments

  • "Most" was semi-arbitrarily taken to mean 80% of non-blank data. That's a much stronger criterion than we used for numeric bin shifting (there it's >30% near the edges and <10% near the middles, and "near" wasn't exact like it is here) because I figured with date data you basically either have a level of precision (months, days, hours) as part of your data or you don't, whereas numeric data has a lot more variety in the levels of precision that exist.
  • In the future, we might want to change the hover format for the date axis if we've found a histogram like this. This might actually be a more generic situation, where we can look at all the data on that axis and see what level of precision it has. Beyond the scope of this PR.

* and ticks on these days incrementing by month would be very unusual
*/
var y = new Date(x + THREEDAYS);
return y.setUTCMonth(y.getUTCMonth() + dtSigned) - THREEDAYS;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@etpinard FYI - this is a slightly breaking change... if anyone explicitly asked for n-month ticks starting on the 26th, 27th, or 28th of a month and the ticks continue past a February, subsequent ticks will be shifted to the last few days of subsequent months. I think it's fairly unlikely this has ever been done, and even if it has it was probably in order to put ticks near the end of the month rather than on that specific date.

Anyway I think the advantages far outweigh this risk.

  • month ticks will never accidentally shift into the wrong month, even if, as a result of this change, they will sometimes change dates (which they did before anyway)
  • you can now put ticks in the last couple of days of the month and have them stay there.

It's probably not clear from this comment why I did this in this PR: I needed it for month bins with exactDay data, where the first bin should start 12 hours before the beginning of each month.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway I think the advantages far outweigh this risk.

👍

@@ -507,7 +507,6 @@ axes.autoBin = function(data, ax, nbins, is2d) {
var distinctData = Lib.distinctVals(data),
msexp = Math.pow(10, Math.floor(
Math.log(distinctData.minDiff) / Math.LN10)),
// TODO: there are some date cases where this will fail...
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed because I have no idea anymore what this was about. We'll wait for a bug report...

@@ -91,8 +91,7 @@ describe('Test histogram2d', function() {

// TODO: even though the binning is done on non-uniform bins,
// the display makes them linear (using only y0 and dy)
// when we sort out https://github.com/plotly/plotly.js/issues/1151
// lets also make it display the bins with nonuniform size
// Can we also make it display the bins with nonuniform size?
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed from the scope of this PR because it turned out to be more of a pain than I thought, and it's a really minor effect visually.

@@ -156,7 +156,7 @@ describe('Test histogram', function() {
nbinsx: 4
});

// Note: this gives half-day gaps between all but the first two
// TODO: this gives half-day gaps between all but the first two
// bars. Now that we have explicit per-bar positioning, perhaps
// we should fill the space, rather than insisting on equal-width
// bars?
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This effect is a little more important than getting the exact right bin sizes in the 2D case, because here it leads to visible gaps between many of the bars. Unfortunately it's also quite a bit more effort to solve when you consider stacked or grouped traces. Worth making an issue for it?

@etpinard etpinard added this to the v1.21.0 milestone Nov 28, 2016
Copy link
Contributor

@etpinard etpinard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. 💃

}
else {
// month ticks - should be the only nonlinear kind we have at this point.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... after traces/histogram/clean_bins.js that is, correct?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only relevant to auto-binning, where nonlinear ticks are dtick values... dtick only has special meaning on date axes (where 'M<n>' is n-months) and on log axes, but when you display a histogram on log axes we still bin on a linear scale.


// unmber of data points that needs to be an exact value
// to shift that increment to (near) the bin center
var threshold = 0.8 * dataCount;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non- ⛔ It would be nice to put these auto-algo threshold fractions in a constants.js file somewhere. It's not obvious where that file should be though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I'll leave this as is for now but I'm happy to change it if you tell me where to put them. I don't have a good feel for what would make this easy to grok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No action required in this PR.

* and ticks on these days incrementing by month would be very unusual
*/
var y = new Date(x + THREEDAYS);
return y.setUTCMonth(y.getUTCMonth() + dtSigned) - THREEDAYS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway I think the advantages far outweigh this risk.

👍

// first row of z below)
expect(out.y0).toBe('1969-07-02 14:24');
expect(out.dy).toBe(365.2 * oneDay);
// Can we also make it display the bins with nonuniform size?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ref #360

@alexcjohnson alexcjohnson merged commit 2d20123 into master Nov 28, 2016
@alexcjohnson alexcjohnson deleted the date-bin-shift branch November 28, 2016 18:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature something new
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants