dateparser package
Subpackages
Submodules
dateparser.conf module
dateparser's parsing behavior can be configured as shown below.
``PREFER_DAY_OF_MONTH`` defaults to ``current`` and can also be set to ``first`` or ``last``:
>>> from dateparser.conf import settings
>>> from dateparser import parse
>>> parse(u'December 2015')
datetime.datetime(2015, 12, 16, 0, 0)
>>> settings.update('PREFER_DAY_OF_MONTH', 'last')
>>> parse(u'December 2015')
datetime.datetime(2015, 12, 31, 0, 0)
>>> settings.update('PREFER_DAY_OF_MONTH', 'first')
>>> parse(u'December 2015')
datetime.datetime(2015, 12, 1, 0, 0)
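The effect of the three values can be sketched with the standard library alone. This is only an illustration of the rule described above, not dateparser's own code: `resolve_day` is a hypothetical helper, and clamping the current day to the month length is an assumption.

```python
import calendar
from datetime import date


def resolve_day(year, month, prefer, today):
    """Choose a day of month for a date string that names only month and year.

    Hypothetical helper for illustration; not part of dateparser.
    """
    last_day = calendar.monthrange(year, month)[1]  # number of days in the month
    if prefer == 'first':
        return 1
    if prefer == 'last':
        return last_day
    if prefer == 'current':
        # Assumption: clamp today's day so it stays valid in shorter months.
        return min(today.day, last_day)
    raise ValueError('unknown preference: %r' % (prefer,))


today = date(2015, 12, 16)
print(resolve_day(2015, 12, 'current', today))  # 16
print(resolve_day(2015, 12, 'last', today))     # 31
print(resolve_day(2015, 12, 'first', today))    # 1
```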
``PREFER_DATES_FROM`` defaults to ``current_period`` and can also be set to ``past`` or ``future``.
Assuming the current date is June 16, 2015:
>>> from dateparser.conf import settings
>>> from dateparser import parse
>>> parse(u'March')
datetime.datetime(2015, 3, 16, 0, 0)
>>> settings.update('PREFER_DATES_FROM', 'future')
>>> parse(u'March')
datetime.datetime(2016, 3, 16, 0, 0)
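One way to picture the year selection is the sketch below. It is a simplified model, not the library's implementation: `resolve_year` is a hypothetical helper, and comparing months only (ignoring the day) is an assumption made to keep the example short.

```python
from datetime import date


def resolve_year(month, prefer, today):
    """Choose a year for a date string that names only a month.

    Hypothetical helper for illustration; not part of dateparser.
    """
    if prefer == 'current_period':
        return today.year
    if prefer == 'future':
        # A month already behind us is pushed into next year.
        return today.year + 1 if month < today.month else today.year
    if prefer == 'past':
        # A month still ahead of us is pulled back into last year.
        return today.year - 1 if month > today.month else today.year
    raise ValueError('unknown preference: %r' % (prefer,))


today = date(2015, 6, 16)
print(resolve_year(3, 'current_period', today))  # 2015
print(resolve_year(3, 'future', today))          # 2016
print(resolve_year(9, 'past', today))            # 2014
```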
``SKIP_TOKENS`` is a list of tokens to discard while detecting language. It defaults to ``['t']``, which skips the ``T`` in ISO-format datetime strings, e.g. ``2015-05-02T10:20:19+0000``.
This only works with ``DateDataParser``, as shown below:
>>> settings.update('SKIP_TOKENS', ['de']) # Turkish word for 'at'
>>> from dateparser.date import DateDataParser
>>> DateDataParser().get_date_data(u'27 Haziran 1981 de') # Turkish (at 27 June 1981)
{'date_obj': datetime.datetime(1981, 6, 27, 0, 0), 'period': 'day'}
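To see why discarding tokens helps, consider a rough model in which only the alphabetic tokens of a date string are passed to language detection. The function below is a hypothetical sketch under that assumption; the tokenizing regular expression and the name `words_for_detection` are not taken from dateparser.

```python
import re


def words_for_detection(date_string, skip_tokens):
    """Return the alphabetic tokens left after dropping the skipped ones.

    Hypothetical helper for illustration; not part of dateparser.
    """
    skip = {token.lower() for token in skip_tokens}
    # Split into runs of letters, runs of digits, and runs of anything else.
    tokens = re.findall(r'[^\W\d_]+|\d+|[\W_]+', date_string)
    return [t for t in tokens if t.isalpha() and t.lower() not in skip]


print(words_for_detection(u'27 Haziran 1981 de', ['de']))       # ['Haziran']
print(words_for_detection(u'2015-05-02T10:20:19+0000', ['t']))  # []
```

With the skipped token removed, the only word left in the first string is the month name, and the ISO string leaves no words at all.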
dateparser.date module
class dateparser.date.DateDataParser(languages=None, allow_redetect_language=False)
Bases: object
Class which handles language detection, translation, and subsequent generic parsing of strings representing dates and/or times.
Parameters:
- languages (list) – A list of two-letter language codes, e.g. ['en', 'es']. If languages are given, the parser does not attempt to detect the language.
- allow_redetect_language (bool) – Enables or disables language re-detection.
Returns: A parser instance
Raises: ValueError - Unknown Language, TypeError - Languages argument must be a list
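The argument checks implied by the Raises entry could look roughly like the sketch below. Everything in it is an assumption made for illustration: `KNOWN_LANGUAGES` is a made-up subset and `validate_languages` is a hypothetical helper, not the constructor's actual code.

```python
# Hypothetical subset of supported codes, used only for this example.
KNOWN_LANGUAGES = {'en', 'es', 'tr'}


def validate_languages(languages):
    """Mimic the documented errors for the ``languages`` argument.

    Hypothetical helper for illustration; not part of dateparser.
    """
    if languages is None:
        return None  # no restriction given: language detection stays enabled
    if not isinstance(languages, list):
        raise TypeError('languages argument must be a list')
    unknown = [code for code in languages if code not in KNOWN_LANGUAGES]
    if unknown:
        raise ValueError('Unknown language(s): %s' % ', '.join(unknown))
    return languages


print(validate_languages(['en', 'es']))  # ['en', 'es']
```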
get_date_data(date_string, date_formats=None)
Parse a string representing a date and/or time in recognizable localized formats. Supports parsing multiple languages and timezones.
Parameters:
- date_string (str|unicode) – A string representing a date and/or time in a recognizably valid format.
- date_formats (list) – A list of format strings using directives as given here. The parser applies formats one by one, taking into account the detected languages.
Returns: a dict mapping keys to a datetime.datetime object and period. For example: {'date_obj': datetime.datetime(2015, 6, 1, 0, 0), 'period': u'day'}
Raises: ValueError - Unknown Language
Note
Period values can be 'day' (default), 'week', 'month', or 'year'.
Period represents the granularity of the date parsed from the given string.
In the example below, since no day information is present, the day is assumed to be the current day, 16, from the current date (which is June 16, 2015, at the time of writing). Hence, the level of precision is month.
>>> DateDataParser().get_date_data(u'March 2015')
{'date_obj': datetime.datetime(2015, 3, 16, 0, 0), 'period': u'month'}
Similarly, for date strings with no day and month information present, the level of precision is year, and the day 16 and month 6 are taken from the current date.
>>> DateDataParser().get_date_data(u'2014')
{'date_obj': datetime.datetime(2014, 6, 16, 0, 0), 'period': u'year'}
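The relationship between missing components and the reported period can be sketched as follows. This is a simplified model rather than the parser's real logic: `build_date_data` is a hypothetical helper, it ignores the 'week' period, and it does not guard against a current day that is invalid in the target month.

```python
from datetime import datetime


def build_date_data(now, year=None, month=None, day=None):
    """Fill missing date parts from ``now`` and report the granularity.

    Hypothetical helper for illustration; not part of dateparser.
    """
    # The finest component actually present in the input sets the period.
    if day is not None:
        period = 'day'
    elif month is not None:
        period = 'month'
    else:
        period = 'year'
    date_obj = datetime(
        year if year is not None else now.year,
        month if month is not None else now.month,
        day if day is not None else now.day,
    )
    return {'date_obj': date_obj, 'period': period}


now = datetime(2015, 6, 16)
print(build_date_data(now, year=2015, month=3))  # 16 March 2015, period 'month'
print(build_date_data(now, year=2014))           # 16 June 2014, period 'year'
```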
- Dates with time zone indications or UTC offsets are returned in UTC time.
>>> DateDataParser().get_date_data(u'23 March 2000, 1:21 PM CET')
{'date_obj': datetime.datetime(2000, 3, 23, 14, 21), 'period': 'day'}
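For readers unfamiliar with offset arithmetic, the general idea of normalizing a local time with a known UTC offset to a naive UTC datetime can be sketched with the standard library. This only illustrates the concept; `to_naive_utc` is a hypothetical helper and says nothing about how dateparser resolves time zone abbreviations.

```python
from datetime import datetime, timedelta, timezone


def to_naive_utc(local_dt, utc_offset_hours):
    """Convert a naive local datetime with a known UTC offset to naive UTC.

    Hypothetical helper for illustration; not part of dateparser.
    """
    tz = timezone(timedelta(hours=utc_offset_hours))
    aware = local_dt.replace(tzinfo=tz)  # attach the offset to the local time
    return aware.astimezone(timezone.utc).replace(tzinfo=None)


# 10:00 PM at UTC-5 is 3:00 AM on the following day in UTC.
print(to_naive_utc(datetime(2012, 1, 12, 22, 0), -5))  # 2012-01-13 03:00:00
```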
language_loader = <dateparser.languages.loader.LanguageDataLoader object>
dateparser.date_parser module
dateparser.date_parser.dateutil_parse(date_string, **kwargs)
Wrapper function around dateutil.parser.parse
dateparser.freshness_date_parser module
dateparser.timezone_parser module
dateparser.timezones module
dateparser.utils module
Module contents
dateparser.parse(date_string, date_formats=None, languages=None)
Parse date and time from the given date string.
Parameters:
- date_string (str|unicode) – A string representing a date and/or time in a recognizably valid format.
- date_formats (list) – A list of format strings using directives as given here. The parser applies formats one by one, taking into account the detected languages.
- languages (list) – A list of two-letter language codes, e.g. ['en', 'es']. If languages are given, the parser does not attempt to detect the language.
Returns: a datetime.datetime if successful, else None
Raises: ValueError - Unknown Language
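The phrase "applies formats one by one" can be pictured as a loop over ``datetime.strptime``. The sketch below is a minimal stand-in, assuming plain strptime directives and no language translation; `parse_with_formats` is a hypothetical helper, not the library's function.

```python
from datetime import datetime


def parse_with_formats(date_string, date_formats):
    """Try each format in order and return the first successful parse.

    Hypothetical helper for illustration; not part of dateparser.
    """
    for fmt in date_formats:
        try:
            return datetime.strptime(date_string, fmt)
        except ValueError:
            continue  # this format did not match; try the next one
    return None  # mirrors the documented "else None" result


print(parse_with_formats('2015-06-16', ['%d/%m/%Y', '%Y-%m-%d']))  # 2015-06-16 00:00:00
print(parse_with_formats('nonsense', ['%Y']))                      # None
```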