
ENH: Add unixtime accessor to Timestamp and DatetimeIndex objects #43975

Closed
fergalm opened this issue Oct 11, 2021 · 5 comments
Labels
Enhancement, Needs Triage (issue that has not been reviewed by a pandas team member)

Comments

@fergalm

fergalm commented Oct 11, 2021

Pain Point

I feel that extracting unixtimes from date strings is less graceful than it could be, and relies on the user knowing more about the internal format of datetime objects than they should have to.

Currently (at least as of version 1.3.3), a user needs to cast a Timestamp object to an integer to extract the unixtime (in nanoseconds):

import pandas as pd
import numpy as np

x = pd.to_datetime("2020")
unixtime_sec = x.astype(np.int64) / 1e9

This makes sense if you understand how a Timestamp object is storing its information internally. However, I would argue that this requires the end-user to understand an implementation detail that they shouldn't need to. I shouldn't need to care whether the time is represented internally by unixtime, TAI, GPS time etc. I should be able to ask for unixtime without relying on knowledge of the implementation details.
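For comparison, here is a minimal sketch of two accessors that, as far as I can tell, already exist on a scalar Timestamp: `timestamp()`, which returns POSIX seconds as a float, and `value`, which exposes the integer nanosecond count. The variable names are illustrative only.

```python
import pandas as pd

x = pd.Timestamp("2020-01-01")

# Seconds since 1970-01-01 as a float.
# pandas treats a tz-naive Timestamp as UTC for this calculation.
unixtime_sec = x.timestamp()

# Integer nanoseconds since 1970-01-01.
unixtime_ns = x.value
```

Both still assume the reader knows where to look, and `value` in particular exposes the nanosecond representation directly.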

I would also argue that the current approach makes code harder to read. While other attributes of time can be obtained by asking for them directly, unixtime requires a more indirect request:

hour_of_day = x.hour              # Easy to read
unixtime_ns = x.astype(np.int64)  # Harder to read

To add to the confusion, the DatetimeIndex object requires the user to use view instead of astype (see #38544). I don't quite understand why the interfaces for Timestamp and DatetimeIndex need to be different.

x = pd.date_range("2020", "2021")
unixtime_sec = x.view(np.int64) / 1e9
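One way to sidestep the astype/view question is to subtract the epoch and divide by a one-second Timedelta. The following is a sketch under the assumption that the index is tz-naive; the variable names are illustrative.

```python
import pandas as pd

# Daily, tz-naive index covering 2020-01-01 through 2021-01-01 inclusive.
x = pd.date_range("2020-01-01", "2021-01-01")

epoch = pd.Timestamp("1970-01-01")

# True division by a Timedelta yields float seconds.
unixtime_sec = (x - epoch) / pd.Timedelta(seconds=1)

# Floor division yields whole seconds as integers.
unixtime_int = (x - epoch) // pd.Timedelta(seconds=1)
```

The same expression works for a scalar Timestamp, but it is verbose compared with a plain attribute lookup.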

Proposed solution

Timestamps and DatetimeIndex objects should have a unixtime accessor consistent with the interface used to access hour of day, day of month, etc.

x = pd.to_datetime("2020")
hour_of_day = x.hour  # Currently exists
day_of_month = x.day   # Currently exists
unixtime_sec = x.unixtime  # This proposal

x = pd.date_range("2020", "2021")
unixtime_sec = x.unixtime  # This proposal

The accessor should return a floating point number for a Timestamp, or an iterable (e.g. a Series) of floating point numbers for a DatetimeIndex. This float should represent the number of seconds elapsed since the epoch. The unit should be seconds, not nanoseconds, to be consistent with the definition of unixtime (https://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_chap04.html#tag_21_04_16).
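The behaviour described above could be sketched as a free function. This is only an illustration: `unixtime` is a hypothetical helper, not an existing pandas API, and converting tz-aware input to UTC first is an assumption about how such an accessor might behave.

```python
import pandas as pd


def unixtime(obj):
    """Hypothetical helper: float seconds since 1970-01-01 UTC.

    Accepts a Timestamp (returns a float) or a DatetimeIndex
    (returns an Index of floats).
    """
    if obj.tz is not None:
        # Assumption: tz-aware input is converted to UTC first.
        obj = obj.tz_convert("UTC")
        epoch = pd.Timestamp("1970-01-01", tz="UTC")
    else:
        # Assumption: tz-naive input is interpreted as UTC.
        epoch = pd.Timestamp("1970-01-01")
    return (obj - epoch) / pd.Timedelta(seconds=1)


scalar_sec = unixtime(pd.Timestamp("2020-01-01"))
index_sec = unixtime(pd.date_range("2020-01-01", periods=2))
```

The same call handles both object types, which is the consistency the proposal asks for.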

API breaking implications

This will not break any other features of the API, although deprecating the astype() and view() accessors could be considered. If these accessors are deprecated, that would make changing the internal representation in a Timestamp object easier in future.

Describe alternatives you've considered

The current methods work, although they are unsatisfactory because:

  1. They require the end user to have knowledge of an implementation detail
  2. They result in code that is harder to read
  3. It is cumbersome to write code that works for both Timestamps and DatetimeIndex objects
@fergalm fergalm added Enhancement Needs Triage Issue that has not been reviewed by a pandas team member labels Oct 11, 2021
@jreback
Contributor

jreback commented Oct 12, 2021

maybe +0.25 for this, would take a PR implementing it. use epochtime as that is the common nomenclature.

@fergalm
Author

fergalm commented Oct 12, 2021

Thanks Jeff,

I wouldn't object to epochtime, but epoch is something of a generic word, and the "obvious" epoch can be field dependent. For example, most astronomers would say epoch time begins on 2000-01-01, or even 1950-01-01. unixtime is less ambiguous.

@mroeschke
Member

Is this the same request as #14772?

@fergalm
Author

fergalm commented Oct 13, 2021

@mroeschke, yes, that does look similar. I didn't find it in my original search. I'm guessing the fact that the other ticket has been inactive since 2019 should be taken as an indication of its priority...

@mroeschke
Member

mroeschke commented Oct 13, 2021

Thanks, going to close as a duplicate of #14772

As pandas is a volunteer project, features get implemented when someone in the community contributes them, rather than according to priority.

3 participants