This repository was archived by the owner on Feb 10, 2021. It is now read-only.
4 files changed, +132 -130 lines changed

@@ -9,10 +9,10 @@ ENV PATH /opt/conda/bin:$PATH

 # hdfs3 - python
 ENV LIBHDFS3_CONF /etc/hadoop/conf/hdfs-site.xml
-RUN conda install -y -q ipython pytest
+RUN conda install -y -q ipython pytest locket
 RUN conda install -y -q libhdfs3 -c conda-forge
 RUN conda create -y -n py3 python=3
-RUN conda install -y -n py3 ipython pytest
+RUN conda install -y -n py3 ipython pytest locket
 RUN conda install -y -n py3 libhdfs3 -c conda-forge

 # Cloudera repositories

@@ -6,24 +6,6 @@ Forked processes

 The ``libhdfs3`` library may fail when an ``HDFileSystem`` is copied to a new
 forked process. This happens in some cases when using ``hdfs3`` with
-``multiprocessing`` in Python 2. Common solutions include the following:
+``multiprocessing`` in Python 2.

-* Use threads
-* Use Python 3 and a multiprocessing context using spawn with
-  ``multiprocessing.get_context(method='spawn')`` see `multiprocessing docs`_
-* Only instantiate ``HDFileSystem`` within the forked processes, do not start
-  an ``HDFileSystem`` within the parent processes or do not use that
-  ``HDFileSystem`` within the child processes.
-* Use a file based lock. We recommend ``locket``::
-
-     $ pip install locket
-
-  .. code-block:: python
-
-     import locket
-
-     with locket.lock_file('.lock'):
-         # do hdfs3 work
-
-
-.. _`multiprocessing docs` : https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
+We get around this by using file-based locks, which slightly limit concurrency.
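
For readers unfamiliar with file-based locking, the idea behind the new sentence can be sketched roughly as follows. This is not the code from this commit and not ``locket``'s implementation: it is a minimal, Unix-only stand-in built on the standard library's ``fcntl.flock``, and the names ``file_lock`` and ``is_locked`` are hypothetical helpers introduced only for illustration. Any process that enters the ``with`` block waits until no other process holds the lock on the same path, which is why concurrency is slightly limited.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def file_lock(path):
    """Hold an exclusive advisory lock on `path` for the duration of the block.

    Hypothetical helper for illustration; Unix-only because it relies on flock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield fd
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)  # release before closing
    finally:
        os.close(fd)


def is_locked(path):
    """Return True if some other open file description holds the lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        try:
            # Non-blocking attempt: fails immediately if the lock is taken.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)
```

Under these assumptions, code that touches HDFS from forked workers would wrap the sensitive section in ``with file_lock('.lock'):`` so that only one process runs it at a time, much like the ``locket.lock_file('.lock')`` snippet in the removed documentation.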