dateparser package
Subpackages
Submodules
dateparser.conf module
dateparser's parsing behavior can be configured as shown below.
``PREFER_DAY_OF_MONTH`` defaults to ``current`` and can also be set to ``first`` or ``last``:
>>> from dateparser.conf import settings
>>> from dateparser import parse
>>> parse(u'December 2015')
datetime.datetime(2015, 12, 16, 0, 0)
>>> settings.update('PREFER_DAY_OF_MONTH', 'last')
>>> parse(u'December 2015')
datetime.datetime(2015, 12, 31, 0, 0)
>>> settings.update('PREFER_DAY_OF_MONTH', 'first')
>>> parse(u'December 2015')
datetime.datetime(2015, 12, 1, 0, 0)
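The effect of the three values can be illustrated with a minimal standard-library sketch. It is not dateparser's implementation: the helper name `resolve_day_of_month` and its `today` parameter are hypothetical, and clamping the current day to the length of the target month is an assumption of this sketch.

```python
import calendar
from datetime import date, datetime


def resolve_day_of_month(year, month, preference="current", today=None):
    """Pick a day for a date string that names only a month and a year.

    Hypothetical helper mirroring the documented values of
    PREFER_DAY_OF_MONTH; it does not call dateparser.
    """
    today = today or date.today()
    last_day = calendar.monthrange(year, month)[1]  # number of days in the month
    if preference == "first":
        day = 1
    elif preference == "last":
        day = last_day
    elif preference == "current":
        # Assumption of this sketch: clamp so the 31st never lands in a shorter month.
        day = min(today.day, last_day)
    else:
        raise ValueError("unknown preference: %r" % (preference,))
    return datetime(year, month, day)


reference = date(2015, 6, 16)  # the "current date" used throughout this page
for preference in ("current", "last", "first"):
    print(preference, resolve_day_of_month(2015, 12, preference, reference))
```

With the reference date fixed to June 16, 2015, the three results match the days shown in the session above (16, 31 and 1).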
``PREFER_DATES_FROM`` defaults to ``current_period`` and can also be set to ``past`` or ``future``.
Assuming the current date is June 16, 2015:
>>> from dateparser.conf import settings
>>> from dateparser import parse
>>> parse(u'March')
datetime.datetime(2015, 3, 16, 0, 0)
>>> settings.update('PREFER_DATES_FROM', 'future')
>>> parse(u'March')
datetime.datetime(2016, 3, 16, 0, 0)
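The year selection behind this setting can be sketched with the standard library alone. This is an illustration under stated assumptions rather than dateparser's code: `resolve_bare_month` and its `today` parameter are hypothetical names, and treating the current month as belonging to the current year for every preference is a choice made for this sketch.

```python
import calendar
from datetime import date, datetime


def resolve_bare_month(month, preference="current_period", today=None):
    """Choose a year for a date string that names only a month.

    Hypothetical helper mirroring the documented values of
    PREFER_DATES_FROM; it does not call dateparser.
    """
    if preference not in ("current_period", "past", "future"):
        raise ValueError("unknown preference: %r" % (preference,))
    today = today or date.today()
    year = today.year
    if preference == "future" and month < today.month:
        year += 1  # the month has already passed this year
    elif preference == "past" and month > today.month:
        year -= 1  # the month has not started yet this year
    # The missing day comes from the reference date, clamped to the month length.
    day = min(today.day, calendar.monthrange(year, month)[1])
    return datetime(year, month, day)


reference = date(2015, 6, 16)
print(resolve_bare_month(3, "current_period", reference))
print(resolve_bare_month(3, "future", reference))
print(resolve_bare_month(9, "past", reference))
```

For March with the reference date June 16, 2015, ``current_period`` stays in 2015 and ``future`` moves to 2016, as in the session above.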
dateparser.date module
class dateparser.date.DateDataParser(languages=None, allow_redetect_language=False)
Bases: object
Class which handles language detection, translation and subsequent generic parsing of a string representing a date and/or time.
Parameters:
- languages (list) – A list of two-letter language codes, e.g. ['en', 'es']. If languages are given, the parser does not attempt to detect the language.
- allow_redetect_language (bool) – Enables/disables language re-detection.
Returns: A parser instance
Raises: ValueError - Unknown Language, TypeError - Languages argument must be a list
get_date_data(date_string, date_formats=None)
Parse a string representing a date and/or time in recognizable localized formats. Supports parsing multiple languages.
Parameters:
- date_string (str|unicode) – A string representing date and/or time in a recognizably valid format.
- date_formats (list) – A list of format strings using ``strptime``-style directives. The parser applies formats one by one, taking into account the detected languages.
Returns: a dict holding the parsed datetime.datetime object under ``date_obj`` and the period under ``period``. For example: {'date_obj': datetime.datetime(2015, 6, 1, 0, 0), 'period': u'day'}
Raises: ValueError - Unknown Language
Note
Period values can be 'day' (default), 'week', 'month' or 'year'.
Period represents the granularity of the date parsed from the given string.
In the example below, since no day information is present, the day is assumed to be the current day, ``16``, taken from the current date (June 16, 2015, at the time of writing). Hence, the level of precision is ``month``.
>>> DateDataParser().get_date_data(u'March 2015')
{'date_obj': datetime.datetime(2015, 3, 16, 0, 0), 'period': u'month'}
Similarly, for date strings with neither day nor month information, the level of precision is ``year``, and the day ``16`` and month ``6`` are taken from the current date.
>>> DateDataParser().get_date_data(u'2014')
{'date_obj': datetime.datetime(2014, 6, 16, 0, 0), 'period': u'year'}
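The relationship between missing fields and the reported period can be sketched as follows. This is a simplified illustration, not the library's logic: `fill_missing_fields` is a hypothetical helper, it works on already-extracted numeric fields instead of a date string, and it leaves out the 'week' period.

```python
import calendar
from datetime import date, datetime


def fill_missing_fields(year=None, month=None, day=None, today=None):
    """Return a dict shaped like the documented get_date_data() result.

    Hypothetical helper: the period is the finest field that was actually
    given, and missing fields are copied from the reference date.
    """
    today = today or date.today()
    if day is not None:
        period = "day"
    elif month is not None:
        period = "month"
    else:
        period = "year"
    year = today.year if year is None else year
    month = today.month if month is None else month
    if day is None:
        # Clamp the borrowed day so it always exists in the target month.
        day = min(today.day, calendar.monthrange(year, month)[1])
    return {"date_obj": datetime(year, month, day), "period": period}


reference = date(2015, 6, 16)
print(fill_missing_fields(year=2015, month=3, today=reference))
print(fill_missing_fields(year=2014, today=reference))
```

With June 16, 2015 as the reference date, the two calls reproduce the dates and periods of the 'March 2015' and '2014' examples in the note.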
TODO: Timezone issues
dateparser.date_parser module
dateparser.date_parser.dateutil_parse(date_string, **kwargs)
Wrapper function around dateutil.parser.parse.
class dateparser.date_parser.new_parser(info=None)
Bases: dateutil.parser.parser
Implements an alternate parse method which supports a preference for dates in the future or in the past. For more, see issue #36.
class dateparser.date_parser.new_relativedelta(dt1=None, dt2=None, years=0, months=0, days=0, leapdays=0, weeks=0, hours=0, minutes=0, seconds=0, microseconds=0, year=None, month=None, day=None, weekday=None, yearday=None, nlyearday=None, hour=None, minute=None, second=None, microsecond=None)
Bases: dateutil.relativedelta.relativedelta
dateutil does not check whether the result of parsing a weekday lies in the future. Since the dates of items are already in the past, this particular case needs to be fixed.
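The kind of correction described here can be sketched with the standard library. This is not the code of ``new_relativedelta``: the helper `most_recent_weekday` is hypothetical, and counting the reference day itself as a valid match is an assumption of this sketch.

```python
from datetime import date, timedelta


def most_recent_weekday(weekday, today=None):
    """Return the latest date on or before `today` that falls on `weekday`.

    `weekday` follows date.weekday(): 0 is Monday, 6 is Sunday.
    Hypothetical helper; it never returns a date in the future.
    """
    today = today or date.today()
    days_back = (today.weekday() - weekday) % 7  # 0 when today already matches
    return today - timedelta(days=days_back)


reference = date(2015, 6, 16)  # a Tuesday
print(most_recent_weekday(4, reference))  # Friday on or before the reference date
print(most_recent_weekday(1, reference))  # Tuesday resolves to the reference date itself
```

Stepping backwards with a modulo keeps the result within the previous seven days, so a bare weekday name never resolves to an upcoming date.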
dateparser.freshness_date_parser module
dateparser.timezone_parser module
dateparser.timezones module
dateparser.utils module
Module contents
dateparser.parse(date_string, date_formats=None, languages=None)
Parse date and time from the given date string.
Parameters:
- date_string (str|unicode) – A string representing date and/or time in a recognizably valid format.
- date_formats (list) – A list of format strings using ``strptime``-style directives. The parser applies formats one by one, taking into account the detected languages.
- languages (list) – A list of two-letter language codes, e.g. ['en', 'es']. If languages are given, the parser does not attempt to detect the language.
Returns: a datetime.datetime if successful, else None
Raises: ValueError - Unknown Language
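Applying formats one by one and returning None on failure can be sketched with ``datetime.strptime``. The helper `parse_with_formats` is hypothetical and much simpler than ``dateparser.parse``: it performs no language detection or translation.

```python
from datetime import datetime


def parse_with_formats(date_string, date_formats):
    """Try each format in order; return the first match or None.

    Hypothetical helper illustrating the documented contract of
    returning None when the string cannot be parsed.
    """
    for fmt in date_formats:
        try:
            return datetime.strptime(date_string, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    return None


print(parse_with_formats("16-06-2015", ["%Y/%m/%d", "%d-%m-%Y"]))
print(parse_with_formats("not a date", ["%Y/%m/%d", "%d-%m-%Y"]))
```

Callers should check the result for None before using it, since a failed parse is reported through the return value and not through an exception.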