allennlp.common.file_utils
Utilities for working with the local dataset cache.
-
allennlp.common.file_utils.cached_path(url_or_filename: Union[str, pathlib.Path], cache_dir: str = None) → str
Given something that might be a URL (or might be a local path), determine which. If it’s a URL, download the file and cache it, and return the path to the cached file. If it’s already a local path, make sure the file exists and then return the path.
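The dispatch described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function name `cached_path_sketch`, the injected `fetch` callable (standing in for the download-and-cache step), and the set of schemes treated as URLs are all assumptions.

```python
import os
from pathlib import Path
from typing import Callable, Union
from urllib.parse import urlparse


def cached_path_sketch(url_or_filename: Union[str, Path],
                       fetch: Callable[[str], str]) -> str:
    """Illustrative only: mirrors the documented URL-vs-local-path dispatch."""
    url_or_filename = str(url_or_filename)
    scheme = urlparse(url_or_filename).scheme

    # Assumption: these are the schemes that count as "a URL".
    if scheme in ("http", "https", "s3"):
        # Hypothetical download-and-cache step; returns the cached file's path.
        return fetch(url_or_filename)
    if os.path.exists(url_or_filename):
        # Already a local file: return the path unchanged.
        return url_or_filename
    if scheme == "":
        # Looks like a local path, but nothing exists there.
        raise FileNotFoundError(f"file {url_or_filename} not found")
    raise ValueError(f"unable to parse {url_or_filename} as a URL or as a local path")
```

Passing the download step in as a parameter keeps the sketch free of network access, which makes the branching easy to exercise in isolation.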
-
allennlp.common.file_utils.filename_to_url(filename: str, cache_dir: str = None) → Tuple[str, str]
Return the url and etag (which may be None) stored for filename. Raise FileNotFoundError if filename or its stored metadata do not exist.
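A rough sketch of this lookup is shown below. It assumes, purely for illustration, that the metadata lives next to the cached file in a JSON sidecar named `<filename>.json` with `url` and `etag` keys; the name `filename_to_url_sketch` and the required `cache_dir` argument are likewise hypothetical.

```python
import json
import os
from typing import Optional, Tuple


def filename_to_url_sketch(filename: str, cache_dir: str) -> Tuple[str, Optional[str]]:
    """Illustrative only: look up the URL and ETag recorded for a cached file."""
    cache_path = os.path.join(cache_dir, filename)
    if not os.path.exists(cache_path):
        raise FileNotFoundError(f"file {cache_path} not found")

    # Assumption: metadata is stored as a JSON sidecar beside the cached file.
    meta_path = cache_path + ".json"
    if not os.path.exists(meta_path):
        raise FileNotFoundError(f"file {meta_path} not found")

    with open(meta_path, encoding="utf-8") as meta_file:
        metadata = json.load(meta_file)

    # The ETag may be absent, in which case None is returned for it.
    return metadata["url"], metadata.get("etag")
```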
-
allennlp.common.file_utils.get_from_cache(url: str, cache_dir: str = None) → str
Given a URL, look for the corresponding dataset in the local cache. If it’s not there, download it. Then return the path to the cached file.
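The look-up-then-download behaviour can be sketched like this. The cache key (a SHA-256 hash of the URL), the temporary-file step, the name `get_from_cache_sketch`, and the injected `download` callable are assumptions made for the example, not details confirmed by the documentation.

```python
import hashlib
import os
from typing import Callable


def get_from_cache_sketch(url: str, cache_dir: str,
                          download: Callable[[str, str], None]) -> str:
    """Illustrative only: return the cached path for url, downloading on a miss."""
    os.makedirs(cache_dir, exist_ok=True)

    # Assumption: the cache file name is derived from a hash of the URL.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    cache_path = os.path.join(cache_dir, key)

    if not os.path.exists(cache_path):
        # Write to a temporary name first so that a failed download never
        # leaves a partial file at the final cache path.
        tmp_path = cache_path + ".tmp"
        download(url, tmp_path)  # hypothetical: writes the URL's content to tmp_path
        os.replace(tmp_path, cache_path)

    return cache_path
```

With this shape, a second call for the same URL returns the same path without invoking `download` again.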
-
allennlp.common.file_utils.is_url_or_existing_file(url_or_filename: Union[str, pathlib.Path, NoneType]) → bool
Given something that might be a URL (or might be a local path), check whether it is a URL or an existing file path.
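One plausible way to perform this check is sketched below. The name `is_url_or_existing_file_sketch`, the handling of `None`, and the list of accepted schemes are assumptions for illustration.

```python
import os
from pathlib import Path
from typing import Optional, Union
from urllib.parse import urlparse


def is_url_or_existing_file_sketch(url_or_filename: Optional[Union[str, Path]]) -> bool:
    """Illustrative only: True for a recognised URL or a path that exists on disk."""
    if url_or_filename is None:
        # Assumption: a missing value is neither a URL nor a file.
        return False
    url_or_filename = str(url_or_filename)
    scheme = urlparse(url_or_filename).scheme
    # Assumption: only these schemes are treated as URLs.
    return scheme in ("http", "https", "s3") or os.path.exists(url_or_filename)
```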
-
allennlp.common.file_utils.read_set_from_file(filename: str) → Set[str]
Extract a de-duped collection (set) of text from a file. Expected file format is one item per line.
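A minimal sketch of reading a one-item-per-line file into a set follows. Stripping trailing whitespace from each line and the name `read_set_from_file_sketch` are assumptions; the documentation only states the file format and the return type.

```python
from typing import Set


def read_set_from_file_sketch(filename: str) -> Set[str]:
    """Illustrative only: collect the distinct lines of a text file."""
    collection: Set[str] = set()
    with open(filename, encoding="utf-8") as file_:
        for line in file_:
            # Assumption: trailing whitespace (including the newline) is dropped.
            collection.add(line.rstrip())
    return collection
```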
-
allennlp.common.file_utils.s3_etag(url: str) → Union[str, NoneType]
Check the ETag on an S3 object.
-
allennlp.common.file_utils.s3_get(url: str, temp_file: IO) → None
Pull a file directly from S3.
-
allennlp.common.file_utils.s3_request(func: Callable)
Wrapper function for S3 requests that produces more helpful error messages.
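The wrapper pattern can be sketched as a decorator that translates a low-level client error into one that names the URL. To stay self-contained, the sketch uses a hypothetical `FakeClientError` in place of a real S3 client exception; which errors the library actually catches, and what it raises instead, is not stated here and is assumed.

```python
from functools import wraps
from typing import Callable


class FakeClientError(Exception):
    """Hypothetical stand-in for an S3 client error carrying an HTTP status code."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"client error {status_code}")
        self.status_code = status_code


def s3_request_sketch(func: Callable) -> Callable:
    """Illustrative only: wrap an S3-style call to give clearer errors."""

    @wraps(func)
    def wrapper(url: str, *args, **kwargs):
        try:
            return func(url, *args, **kwargs)
        except FakeClientError as exc:
            if exc.status_code == 404:
                # Assumption: "not found" is re-raised with the URL in the message.
                raise FileNotFoundError(f"file {url} not found") from exc
            # Any other client error is passed through unchanged.
            raise

    return wrapper
```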
-
allennlp.common.file_utils.session_with_backoff() → requests.sessions.Session
We ran into an issue where HTTP requests to S3 were timing out, possibly because we were making too many requests too quickly. This helper function returns a requests session that has retry-with-backoff built in. See stackoverflow.com/questions/23267409/how-to-implement-retry-mechanism-into-python-requests-library
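A common way to build such a session is to mount an `HTTPAdapter` configured with a urllib3 `Retry` policy. The sketch below follows that pattern; the retry count, backoff factor, and status codes are assumed values, and `session_with_backoff_sketch` is a hypothetical name.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


def session_with_backoff_sketch(total: int = 5,
                                backoff_factor: float = 1.0) -> requests.Session:
    """Illustrative only: a requests session that retries with backoff."""
    session = requests.Session()
    # Assumption: retry on these gateway/server errors, up to `total` times,
    # with the delay between attempts growing according to `backoff_factor`.
    retries = Retry(total=total,
                    backoff_factor=backoff_factor,
                    status_forcelist=[502, 503, 504])
    adapter = HTTPAdapter(max_retries=retries)
    session.mount("http://", adapter)
    session.mount("https://", adapter)
    return session
```

Requests made through the returned session, for example `session.get(url)`, then pick up the retry policy without any further changes at the call site.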