-
Notifications
You must be signed in to change notification settings - Fork 67
fix logger bug for batch issue upload #622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The line referred to in the error message is actually this one:
but it looks like the line you're currently repairing has the same problem. So probably just extend the fix to both locations? and hopefully the test you write will catch either of them. |
Not sure how I missed that, but I've extended the fix to line 95. Hopefully, that should cover everything! Testing the logger is turning out to be a bit of a challenge, so we will open a separate issue to track that. |
if issuedir_match and os.path.isdir(path): | ||
if issuedir_match: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting -- what will happen if path
is a text file called issue_12345678
? will the find_csv_files
call on line 96 handle that gracefully, or crash?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I take a try on this approach by picking up the issue_12345 by a text filed called issue_12345, here is some code of my approach:
path='prefix/to/the/data/issue_20200408.csv'
issuedir_match = CsvImporter.PATTERN_ISSUE_DIR.match(('.').join(path.split('.')[:-1]))
it will pass on the tests but will fail on integration tests. Because the change of form of issue_path will effect file load_sensors.py, so it will need some extra changes if I want to use the text file form as issue_123456.csv.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, i meant that we want the following:
- if someone mistakenly creates a file which matches
CsvImporter.PATTERN_ISSUE_DIR
but is not a directory - and we try to load the directory containing that bad file as a batch import
- then the code should ignore the bad file
This was originally accomplished using the and os.path.isdir(path)
clause. Without that clause, will the code crash in the above scenario?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha! I previously hadn't make dir path to a valid dir so the test won't pass. I already add back the check code and add one more test of the invalid dir path. One more question, can I know why in the function find_csv_files
check the csv path(the path format) instead to check the dir path? Should I add thisos.path.isdir(path)
also in the function find_csv_files
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
both find_issue_specific_csv_files
and find_csv_files
check that the paths they produce match a regular expression because the regex is used to extract information from the path (issue, source, time_value, geo_type, and signal).
find_issue_specific_csv_files
checks that the paths it produces are directories, because we've fed it bad input before by mistake 😉 . It doesn't get run very often, and each run has to be configured by hand, which means it's easy to accidentally do it wrong.
We run find_csv_files
24 times a day, so if we ever did accidentally feed it a filesystem that had directories with names that should have been normal files, or normal files with names that should have been directories, we would have fixed it in early summer 2020. All the things that feed find_csv_files
are automated and less susceptible to human error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉
dev/docker/python/Dockerfile
Outdated
@@ -1,4 +1,8 @@ | |||
# start with the `delphi_python` image | |||
FROM delphi_python | |||
|
|||
# copy over all source files | |||
COPY repos repos | |||
RUN chmod -R o+r repos/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @sgratzl I added this to the Dockerfile to avoid having to repeatedly rebuild both delphi_python
and delphi_web_python
during development. Will this break CI or can I leave it in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why don't you just add this as a volume into the docker container? then you just have to restart the container. At least that is what I'm doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still good, just dropped the unnecessary docker edit 👍
change logger.warning to logger.info to avoid error
still need to add/update unit and integration tests to catch this issue in the future