mrjob.parse - log parsing
Utilities for parsing errors, counters, and status messages.
mrjob.parse.is_s3_uri(uri)
Return True if uri can be parsed into an S3 URI, False otherwise.
mrjob.parse.is_uri(uri)
Return True if uri is a URI and contains :// (we only care about URIs that can describe files).
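The check described above can be approximated with the standard library. This is a minimal sketch, not mrjob's code: `is_uri_sketch` is a hypothetical name, and combining a `://` substring test with `urlparse` is an assumption about how such a check might be written.

```python
from urllib.parse import urlparse


def is_uri_sketch(uri):
    """Hypothetical stand-in for is_uri(): require '://' and a scheme."""
    # A bare scheme such as "C:" on a Windows path is not enough;
    # the string must also contain "://" to count as a file-like URI.
    return '://' in uri and bool(urlparse(uri).scheme)


checks = {
    's3://walrus/tmp/': is_uri_sketch('s3://walrus/tmp/'),
    'hdfs:///user/walrus/data': is_uri_sketch('hdfs:///user/walrus/data'),
    '/tmp/data.txt': is_uri_sketch('/tmp/data.txt'),
    'C:\\data\\input.txt': is_uri_sketch('C:\\data\\input.txt'),
}
```

In this sketch the first two entries of `checks` are True and the last two are False, so local paths, including Windows drive paths, are not mistaken for URIs.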
mrjob.parse.parse_mr_job_stderr(stderr, counters=None)
Parse counters and status messages out of MRJob output.
Parameters:
- stderr – a filehandle, a list of lines (bytes), or bytes
- counters – counters so far, to update; a map from group (string) to counter name (string) to count
Returns a dictionary with the keys counters, statuses, other:
- counters: counters so far; same format as above
- statuses: a list of status messages encountered
- other: lines (strings) that aren’t either counters or status messages
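To illustrate the shape of the returned dictionary, here is a simplified sketch that does not import mrjob. It assumes the Hadoop Streaming stderr conventions (`reporter:counter:<group>,<counter>,<amount>` and `reporter:status:<message>`) and accepts only a list of bytes lines. The name `parse_stderr_sketch` and both regular expressions are illustrative, not taken from mrjob.

```python
import re

# Assumed Hadoop Streaming stderr conventions for counters and statuses.
_COUNTER_RE = re.compile(rb'^reporter:counter:(.*?),(.*?),(-?\d+)$')
_STATUS_RE = re.compile(rb'^reporter:status:(.*?)$')


def parse_stderr_sketch(lines, counters=None):
    """Hypothetical stand-in for parse_mr_job_stderr() (list of bytes only)."""
    if counters is None:
        counters = {}
    statuses = []
    other = []

    for line in lines:
        stripped = line.rstrip(b'\r\n')
        counter_match = _COUNTER_RE.match(stripped)
        status_match = _STATUS_RE.match(stripped)

        if counter_match:
            group, name, amount = counter_match.groups()
            group_counters = counters.setdefault(group.decode('utf-8'), {})
            key = name.decode('utf-8')
            # Accumulate into any count already present for this counter.
            group_counters[key] = group_counters.get(key, 0) + int(amount)
        elif status_match:
            statuses.append(status_match.group(1).decode('utf-8'))
        else:
            # Anything else (tracebacks, debug prints) is kept as text.
            other.append(line.decode('utf-8'))

    return {'counters': counters, 'statuses': statuses, 'other': other}


result = parse_stderr_sketch([
    b'reporter:counter:words,total,3\n',
    b'reporter:status:halfway there\n',
    b'reporter:counter:words,total,2\n',
    b'Traceback (most recent call last):\n',
])
```

Here `result['counters']` is `{'words': {'total': 5}}`, `result['statuses']` is `['halfway there']`, and the traceback line ends up in `result['other']`. Passing an existing dictionary as `counters` updates it in place, which matches the "counters so far, to update" description of that parameter.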
mrjob.parse.parse_s3_uri(uri)
Parse an S3 URI into (bucket, key).
>>> parse_s3_uri('s3://walrus/tmp/')
('walrus', 'tmp/')
If uri is not an S3 URI, raise a ValueError.
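The relationship between parse_s3_uri and is_s3_uri (one raises ValueError, the other turns that into a boolean) can be sketched with the standard library alone. The `_sketch` function names are hypothetical, and the set of accepted schemes (`s3`, `s3n`, `s3a`) is an assumption made for illustration, not something this page specifies.

```python
from urllib.parse import urlparse

# Assumption: which schemes count as "S3" is a choice made for this sketch.
_S3_SCHEMES = ('s3', 's3n', 's3a')


def parse_s3_uri_sketch(uri):
    """Hypothetical stand-in for parse_s3_uri(): return (bucket, key)."""
    components = urlparse(uri)
    if components.scheme not in _S3_SCHEMES or '/' not in components.path:
        raise ValueError('Invalid S3 URI: %s' % uri)
    # The bucket is the network location; the key is the path without
    # its leading slash.
    return components.netloc, components.path[1:]


def is_s3_uri_sketch(uri):
    """Hypothetical stand-in for is_s3_uri(): True if parsing succeeds."""
    try:
        parse_s3_uri_sketch(uri)
        return True
    except ValueError:
        return False


bucket, key = parse_s3_uri_sketch('s3://walrus/tmp/')
```

With this sketch, `bucket` is `'walrus'` and `key` is `'tmp/'`, matching the doctest above, while `is_s3_uri_sketch('http://example.com/tmp/')` returns False instead of raising.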
mrjob.parse.to_uri(path_or_uri)
If path_or_uri is not a URI already, convert it to a file:/// URI.
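One way to get the behavior described above is with `pathlib`. This is a sketch under stated assumptions rather than mrjob's implementation: `to_uri_sketch` is a hypothetical name, the `://` test stands in for is_uri, and relative paths are resolved against the current working directory.

```python
import os.path
from pathlib import Path


def to_uri_sketch(path_or_uri):
    """Hypothetical stand-in for to_uri(): leave URIs alone, wrap paths."""
    if '://' in path_or_uri:
        # Already a URI (s3://, hdfs://, file://, ...): return unchanged.
        return path_or_uri
    # Path.as_uri() requires an absolute path, so make it absolute first.
    return Path(os.path.abspath(path_or_uri)).as_uri()


unchanged = to_uri_sketch('s3://walrus/tmp/')
converted = to_uri_sketch('data.txt')
```

`unchanged` is the original string, and `converted` is a `file:///` URI ending in `/data.txt` whose middle part depends on the current working directory.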