parser — dateutil 0.1.dev1+gc981f9c documentation
This module offers a generic date/time string parser which is able to parse most known formats to represent a date and/or time.
This module attempts to be forgiving with regards to unlikely input formats, returning a datetime object even for dates which are ambiguous. If an element of a date/time stamp is omitted, the following rules are applied:
- If AM or PM is left unspecified, a 24-hour clock is assumed; however, an hour on a 12-hour clock (0 <= hour <= 12) must be specified if AM or PM is specified.
- If a time zone is omitted, a timezone-naive datetime is returned.
If any other elements are missing, they are taken from the datetime.datetime object passed to the parameter default. If this results in a day number exceeding the valid number of days per month, the value falls back to the end of the month.
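The rules above can be sketched as follows. This is a minimal illustration, assuming a dateutil release that implements the end-of-month fallback described here; the fixed `default` value is chosen only so the results do not depend on today's date.

```python
from datetime import datetime

from dateutil.parser import parse

# A fixed default, so the outcome does not depend on the current date.
default = datetime(2021, 1, 31, 9, 30)

# Only the month is given: year and time come from `default`.
# Day 31 does not exist in February 2021, so the day falls back to the 28th.
month_only = parse("Feb", default=default)

# With PM given, the hour is read on a 12-hour clock; the date comes from `default`.
evening = parse("10:30 PM", default=default)

# No time zone in the string, so the result is timezone-naive.
print(month_only, evening, evening.tzinfo)
```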
Additional resources about date/time string formats can be found below:
- A summary of the international standard date and time notation
- W3C Date and Time Formats
- Time Formats (Planetary Rings Node)
- CPAN ParseDate module
- Java SimpleDateFormat Class
Functions
parser.parse(timestr, parserinfo=None, **kwargs)
Parse a string in one of the supported formats, using the parserinfo parameters.
Parameters:
- timestr – A string containing a date/time stamp.
- parserinfo – A parserinfo object containing parameters for the parser. If None, the default arguments to the parserinfo constructor are used.
The **kwargs parameter takes the following keyword arguments:
Parameters:
- default – The default datetime object. If this is a datetime object and not None, elements specified in timestr replace elements in the default object.
- ignoretz – If set to True, time zones in parsed strings are ignored and a naive datetime object is returned.
- tzinfos – Additional time zone names / aliases which may be present in the string. This argument maps time zone names (and optionally offsets from those time zones) to time zones. This parameter can be a dictionary mapping time zone names to time zones, or a function taking two parameters (tzname and tzoffset) and returning a time zone.
The timezones to which the names are mapped can be an integer offset from UTC in seconds or a tzinfo object.

>>> from dateutil.parser import parse
>>> from dateutil.tz import gettz
>>> tzinfos = {"BRST": -7200, "CST": gettz("America/Chicago")}
>>> parse("2012-01-19 17:21:00 BRST", tzinfos=tzinfos)
datetime.datetime(2012, 1, 19, 17, 21, tzinfo=tzoffset(u'BRST', -7200))
>>> parse("2012-01-19 17:21:00 CST", tzinfos=tzinfos)
datetime.datetime(2012, 1, 19, 17, 21,
                  tzinfo=tzfile('/usr/share/zoneinfo/America/Chicago'))

This parameter is ignored if ignoretz is set.
- dayfirst – Whether to interpret the first value in an ambiguous 3-integer date (e.g. 01/05/09) as the day (True) or month (False). If yearfirst is set to True, this distinguishes between YDM and YMD. If set to None, this value is retrieved from the current parserinfo object (which itself defaults to False).
- yearfirst – Whether to interpret the first value in an ambiguous 3-integer date (e.g. 01/05/09) as the year. If True, the first number is taken to be the year, otherwise the last number is taken to be the year. If this is set to None, the value is retrieved from the current parserinfo object (which itself defaults to False).
- fuzzy – Whether to allow fuzzy parsing, allowing for strings like “Today is January 1, 2047 at 8:21:00AM”.
- fuzzy_with_tokens – If True, fuzzy is automatically set to True, and the parser will return a tuple where the first element is the parsed datetime.datetime datetimestamp and the second element is a tuple containing the portions of the string which were ignored:

>>> from dateutil.parser import parse
>>> parse("Today is January 1, 2047 at 8:21:00AM", fuzzy_with_tokens=True)
(datetime.datetime(2047, 1, 1, 8, 21), (u'Today is ', u' ', u'at '))

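The tzinfos parameter also accepts a function, as described in the parameter list above. The following is a small sketch of that form; `KNOWN_OFFSETS` and `resolve_tz` are hypothetical names introduced for illustration, and the offsets are example values rather than an authoritative mapping.

```python
from datetime import timedelta

from dateutil.parser import parse

# Hypothetical lookup table: abbreviation -> offset from UTC in seconds.
KNOWN_OFFSETS = {"BRST": -7200, "EST": -18000}


def resolve_tz(tzname, tzoffset):
    """Return an integer UTC offset in seconds, or None if the name is unknown."""
    return KNOWN_OFFSETS.get(tzname)


# The callable is consulted for the zone name found in the string.
aware = parse("2012-01-19 17:21:00 BRST", tzinfos=resolve_tz)

# With ignoretz set, tzinfos is ignored and a naive datetime is returned.
naive = parse("2012-01-19 17:21:00 BRST", tzinfos=resolve_tz, ignoretz=True)

print(aware.utcoffset(), naive.tzinfo)
```

A function is mostly useful when the mapping cannot be written down as a fixed dictionary, for example when it is loaded from configuration.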
Returns:
Returns a datetime.datetime object or, if the fuzzy_with_tokens option is True, returns a tuple, the first element being a datetime.datetime object, the second a tuple containing the fuzzy tokens.
Raises:
- ParserError – Raised for invalid or unknown string formats, if the provided tzinfo is not in a valid format, or if an invalid date would be created.
- OverflowError – Raised if the parsed date exceeds the largest valid C integer on your system.
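As an illustration of the dayfirst and yearfirst parameters documented for parse(), the sketch below parses the ambiguous string 01/05/09 under each combination. The two-digit years are expanded relative to the current year, so the expected years in the comments assume the code runs within a few decades of when this was written.

```python
from dateutil.parser import parse

ambiguous = "01/05/09"

# Default: month/day/year.
mdy = parse(ambiguous).date()                                 # 2009-01-05

# dayfirst=True: day/month/year.
dmy = parse(ambiguous, dayfirst=True).date()                  # 2009-05-01

# yearfirst=True: year/month/day.
ymd = parse(ambiguous, yearfirst=True).date()                 # 2001-05-09

# Both set: year/day/month.
ydm = parse(ambiguous, yearfirst=True, dayfirst=True).date()  # 2001-09-05

for label, value in [("MDY", mdy), ("DMY", dmy), ("YMD", ymd), ("YDM", ydm)]:
    print(label, value.isoformat())
```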
parser.isoparse(dt_str)
Parse an ISO-8601 datetime string into a datetime.datetime.
An ISO-8601 datetime string consists of a date portion, followed optionally by a time portion - the date and time portions are separated by a single character separator, which is T in the official standard. Incomplete date formats (such as YYYY-MM) may not be combined with a time portion.
Supported date formats are:
Common:
- YYYY
- YYYY-MM
- YYYY-MM-DD or YYYYMMDD
Uncommon:
- YYYY-Www or YYYYWww - ISO week (day defaults to 0)
- YYYY-Www-D or YYYYWwwD - ISO week and day
The ISO week and day numbering follows the same logic as datetime.date.isocalendar().
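A short sketch of the week-based formats, with the day given explicitly, and of the correspondence with isocalendar(). The sample date is arbitrary.

```python
from datetime import datetime

from dateutil.parser import isoparse

# ISO week 1 of 2004, day 1 (Monday), in extended and basic notation.
extended = isoparse("2004-W01-1")
basic = isoparse("2004W011")

# ISO week 1 of 2004 starts in the previous calendar year.
print(extended)                      # 2003-12-29 00:00:00
print(tuple(extended.isocalendar())) # (2004, 1, 1)
```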
Supported time formats are:
- hh
- hh:mm or hhmm
- hh:mm:ss or hhmmss
- hh:mm:ss.ssssss (Up to 6 sub-second digits)
Midnight is a special case for hh, as the standard supports both 00:00 and 24:00 as a representation. The decimal separator can be either a dot or a comma.
Caution
Support for fractional components other than seconds is part of the ISO-8601 standard, but is not currently implemented in this parser.
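The time formats can be exercised as in the sketch below, assuming a dateutil release that matches the behavior described here (comma as a decimal separator and 24:00 as midnight). The timestamps are arbitrary examples.

```python
from datetime import datetime

from dateutil.parser import isoparse

# Fractional seconds with either decimal separator.
dot = isoparse("2018-04-29T17:45:25.5")
comma = isoparse("2018-04-29T17:45:25,5")

# 24:00 denotes midnight at the end of the given day.
end_of_day = isoparse("2018-04-29T24:00")

# Basic notation without separators in the date or time portion.
compact = isoparse("20180429T174525")

print(dot, comma, end_of_day, compact, sep="\n")
```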
Supported time zone offset formats are:
- Z (UTC)
- ±HH:MM
- ±HHMM
- ±HH
Offsets will be represented as dateutil.tz.tzoffset objects, with the exception of UTC, which will be represented as dateutil.tz.tzutc. Time zone offsets equivalent to UTC (such as +00:00) will also be represented as dateutil.tz.tzutc.
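The offset handling described above can be checked with a few sample strings. This is only an illustrative sketch; the timestamps and offsets are arbitrary.

```python
from datetime import timedelta

from dateutil import tz
from dateutil.parser import isoparse

utc_z = isoparse("2018-04-29T17:45:25Z")
utc_zero = isoparse("2018-04-29T17:45:25+00:00")
plus_0530 = isoparse("2018-04-29T17:45:25+05:30")
minus_0800 = isoparse("2018-04-29T17:45:25-0800")
plus_02 = isoparse("2018-04-29T17:45:25+02")

# Z and zero offsets map to tzutc; other offsets map to tzoffset.
for dt in (utc_z, utc_zero, plus_0530, minus_0800, plus_02):
    print(type(dt.tzinfo).__name__, dt.utcoffset())
```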
Parameters:
dt_str – A string or stream containing only an ISO-8601 datetime string
Returns:
Returns a datetime.datetime representing the string. Unspecified components default to their lowest value.
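To illustrate how unspecified components default to their lowest value, a brief sketch with partial inputs:

```python
from datetime import datetime

from dateutil.parser import isoparse

year_only = isoparse("2018")           # month and day default to 1
year_month = isoparse("2018-04")       # day defaults to 1
hour_only = isoparse("2018-04-29T17")  # minutes and seconds default to 0

print(year_only, year_month, hour_only, sep="\n")
```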
Warning
As of version 2.7.0, the strictness of the parser should not be considered a stable part of the contract. Any valid ISO-8601 string that parses correctly with the default settings will continue to parse correctly in future versions, but invalid strings that currently fail (e.g. 2017-01-01T00:00+00:00:00) are not guaranteed to continue failing in future versions if they encode a valid date.
Added in version 2.7.0.
Classes
class dateutil.parser.parserinfo(dayfirst=False, yearfirst=False)
Class which handles what inputs are accepted. Subclass this to customize the language and acceptable values for each parameter.
Parameters:
- dayfirst – Whether to interpret the first value in an ambiguous 3-integer date (e.g. 01/05/09) as the day (True) or month (False). If yearfirst is set to True, this distinguishes between YDM and YMD. Default is False.
- yearfirst – Whether to interpret the first value in an ambiguous 3-integer date (e.g. 01/05/09) as the year. If True, the first number is taken to be the year, otherwise the last number is taken to be the year. Default is False.
AMPM = [('am', 'a'), ('pm', 'p')]
HMS = [('h', 'hour', 'hours'), ('m', 'minute', 'minutes'), ('s', 'second', 'seconds')]
JUMP = [' ', '.', ',', ';', '-', '/', "'", 'at', 'on', 'and', 'ad', 'm', 't', 'of', 'st', 'nd', 'rd', 'th']
MONTHS = [('Jan', 'January'), ('Feb', 'February'), ('Mar', 'March'), ('Apr', 'April'), ('May', 'May'), ('Jun', 'June'), ('Jul', 'July'), ('Aug', 'August'), ('Sep', 'Sept', 'September'), ('Oct', 'October'), ('Nov', 'November'), ('Dec', 'December')]
PERTAIN = ['of']
TZOFFSET = {}
UTCZONE = ['UTC', 'GMT', 'Z', 'z']
WEEKDAYS = [('Mon', 'Monday'), ('Tue', 'Tuesday'), ('Wed', 'Wednesday'), ('Thu', 'Thursday'), ('Fri', 'Friday'), ('Sat', 'Saturday'), ('Sun', 'Sunday')]
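Subclassing can be sketched by overriding the class attributes listed above. The example below is a rough illustration for Spanish month names; `SpanishParserInfo` is a hypothetical class written for this example, not something dateutil provides, and it covers only months and the filler word "de".

```python
from datetime import datetime

from dateutil.parser import parse, parserinfo


class SpanishParserInfo(parserinfo):
    """Hypothetical parserinfo subclass that recognizes Spanish month names."""

    MONTHS = [
        ("Ene", "Enero"), ("Feb", "Febrero"), ("Mar", "Marzo"),
        ("Abr", "Abril"), ("May", "Mayo"), ("Jun", "Junio"),
        ("Jul", "Julio"), ("Ago", "Agosto"), ("Sep", "Septiembre"),
        ("Oct", "Octubre"), ("Nov", "Noviembre"), ("Dic", "Diciembre"),
    ]
    # Treat "de" as a token that can be skipped, in addition to the defaults.
    JUMP = parserinfo.JUMP + ["de"]


info = SpanishParserInfo()
short_form = parse("5 Marzo 2021", parserinfo=info)
long_form = parse("5 de marzo de 2021", parserinfo=info)
print(short_form, long_form)
```

Lookups are case-insensitive, so "marzo" and "Marzo" are treated alike.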
convertyear(year, century_specified=False)
Converts two-digit years to a year within the [-50, 49] range of self._year (the year of the current local time).
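The window can be checked directly. This sketch assumes the behavior stated above and compares against the current local year, so the concrete values it prints change over time.

```python
from datetime import datetime

from dateutil.parser import parserinfo

info = parserinfo()
this_year = datetime.now().year

# Expand every possible two-digit year and inspect the resulting window.
converted = {yy: info.convertyear(yy) for yy in range(100)}
print(min(converted.values()) - this_year, max(converted.values()) - this_year)

# Years that already carry a century are returned unchanged.
full_year = info.convertyear(1987)
explicit_century = info.convertyear(87, century_specified=True)
```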
Warnings and Exceptions
class dateutil.parser.ParserError
Exception subclass used for any failure to parse a datetime string.
This is a subclass of ValueError, and should be raised any time earlier versions of dateutil would have raised ValueError.
Added in version 2.8.1.
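A minimal sketch of handling parse failures, assuming dateutil 2.8.1 or later so that ParserError can be imported. `try_parse` is a hypothetical helper introduced only for this example.

```python
from dateutil.parser import ParserError, parse


def try_parse(text):
    """Return the parsed datetime, or None if the string cannot be parsed."""
    try:
        return parse(text)
    except ParserError:
        return None


ok = try_parse("2021-03-05 14:00")
unknown_format = try_parse("not a date")
invalid_date = try_parse("2021-02-30")  # February has no 30th day

print(ok, unknown_format, invalid_date)
```

Because ParserError is a subclass of ValueError, code that catches ValueError keeps working across older and newer versions.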
class dateutil.parser.UnknownTimezoneWarning
Raised when the parser finds a timezone it cannot parse into a tzinfo.
Added in version 2.7.0.
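The warning can be observed with the standard warnings module, as in this sketch. It assumes that "BRST" is not the abbreviation of the local time zone on the machine running the code and that no tzinfos mapping is supplied.

```python
import warnings

from dateutil.parser import UnknownTimezoneWarning, parse

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # No tzinfos mapping is given, so the parser cannot resolve "BRST".
    result = parse("2012-01-19 17:21:00 BRST")

unknown_tz = [w for w in caught if issubclass(w.category, UnknownTimezoneWarning)]

# The zone name is dropped and a naive datetime is returned.
print(len(unknown_tz), result, result.tzinfo)
```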