dateparser package

Subpackages

Submodules

dateparser.conf module

exception dateparser.conf.SettingValidationError

Bases: ValueError

class dateparser.conf.Settings(*args, **kwargs)

Bases: object

Control and configure the default parsing behavior of dateparser. The currently supported settings are:

  • DATE_ORDER

  • PREFER_LOCALE_DATE_ORDER

  • TIMEZONE

  • TO_TIMEZONE

  • RETURN_AS_TIMEZONE_AWARE

  • PREFER_MONTH_OF_YEAR

  • PREFER_DAY_OF_MONTH

  • PREFER_DATES_FROM

  • RELATIVE_BASE

  • STRICT_PARSING

  • REQUIRE_PARTS

  • SKIP_TOKENS

  • NORMALIZE

  • RETURN_TIME_AS_PERIOD

  • PARSERS

  • DEFAULT_LANGUAGES

  • LANGUAGE_DETECTION_CONFIDENCE_THRESHOLD

  • CACHE_SIZE_LIMIT

classmethod get_key(settings=None)
replace(mod_settings=None, **kwds)
dateparser.conf.apply_settings(f)
dateparser.conf.check_settings(settings)

Check whether the provided settings are valid and raise SettingValidationError if they are not. Only the modified settings are checked.

dateparser.date module

class dateparser.date.DateData(*, date_obj=None, period=None, locale=None)

Bases: object

Class that represents the parsed data with useful information. It can be accessed with square brackets like a dict object.

class dateparser.date.DateDataParser(languages=None, locales=None, region=None, try_previous_locales=False, use_given_order=False, settings=None, detect_languages_function=None)

Bases: object

Class which handles language detection, translation and subsequent generic parsing of a string representing a date and/or time.

Parameters:
  • languages (list) – A list of language codes, e.g. ['en', 'es', 'zh-Hant']. If locales are not given, languages and region are used to construct locales for translation.

  • locales (list) – A list of locale codes, e.g. ['fr-PF', 'qu-EC', 'af-NA']. The parser uses only these locales to translate the date string.

  • region (str) – A region code, e.g. 'IN', '001', 'NE'. If locales are not given, languages and region are used to construct locales for translation.

  • try_previous_locales (bool) – If True, locales previously used to translate date are tried first.

  • use_given_order (bool) – If True, locales are tried for translation of date string in the order in which they are given.

  • settings (dict) – Configure customized behavior using settings defined in dateparser.conf.Settings.

  • detect_languages_function (function) – A function for language detection that takes as input a text and a confidence_threshold, and returns a list of detected language codes. Note: this function is only used if languages and locales are not provided.

Returns:

A parser instance

Raises:

  • ValueError – Unknown language.

  • TypeError – The languages argument must be a list.

  • SettingValidationError – A provided setting is not valid.

get_date_data(date_string, date_formats=None)

Parse a string representing a date and/or time in recognizable localized formats. Supports parsing multiple languages and timezones.

Parameters:
  • date_string (str) – A string representing date and/or time in a recognizably valid format.

  • date_formats (list) – A list of format strings using Python's strftime/strptime directives. The parser applies the formats one by one, taking into account the detected languages.

Returns:

a DateData object.

Raises:

ValueError – Unknown language.

Note

Period values can be 'day' (default), 'week', 'month', 'year', or 'time'.

Period represents the granularity of the date parsed from the given string.

In the example below, the string contains no day, so the day is taken from the current date (June 16, 2015, at the time of writing). Hence, the level of precision is month:

>>> DateDataParser().get_date_data('March 2015')
DateData(date_obj=datetime.datetime(2015, 3, 16, 0, 0), period='month', locale='en')

Similarly, for date strings with neither day nor month information, the level of precision is year, and both the day (16) and the month (6) are taken from the current date:

>>> DateDataParser().get_date_data('2014')
DateData(date_obj=datetime.datetime(2014, 6, 16, 0, 0), period='year', locale='en')

As the output below shows, a date with a time zone indication or UTC offset is returned as a timezone-aware datetime carrying that zone, unless configured otherwise through Settings.

>>> DateDataParser().get_date_data('23 March 2000, 1:21 PM CET')
DateData(date_obj=datetime.datetime(2000, 3, 23, 13, 21, tzinfo=<StaticTzInfo 'CET'>),
period='day', locale='en')
get_date_tuple(*args, **kwargs)
locale_loader = None
dateparser.date.date_range(begin, end, **kwargs)
dateparser.date.get_date_from_timestamp(date_string, settings, negative=False)
dateparser.date.get_intersecting_periods(low, high, period='day')
dateparser.date.parse_with_formats(date_string, date_formats, settings)

Parse with formats and return a dictionary with ‘period’ and ‘obj_date’.

Returns:

datetime.datetime, dict or None
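
To make the idea of format-based parsing concrete, here is a minimal standard-library sketch of trying explicit formats in order. It is not dateparser's implementation: the helper name first_matching_format and the period rule (month when the format has no %d) are assumptions for illustration.

```python
from datetime import datetime


def first_matching_format(date_string, date_formats):
    """Return a dict with 'date_obj' and 'period' for the first format that fits, else None."""
    for fmt in date_formats:
        try:
            parsed = datetime.strptime(date_string, fmt)
        except ValueError:
            continue  # this format does not match; try the next one
        # Assumed rule: without a day directive the precision is a month.
        period = "day" if "%d" in fmt else "month"
        return {"date_obj": parsed, "period": period}
    return None


result = first_matching_format("2015-03", ["%d/%m/%Y", "%Y-%m"])
print(result)
```

Formats are tried strictly in list order, so more specific formats should come first when several could match the same string.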

dateparser.date.sanitize_date(date_string)
dateparser.date.sanitize_spaces(date_string)

dateparser.date_parser module

class dateparser.date_parser.DateParser

Bases: object

parse(date_string, parse_method, settings=None)

dateparser.freshness_date_parser module

class dateparser.freshness_date_parser.FreshnessDateDataParser

Bases: object

Parses date strings like “1 year, 2 months ago” and “3 hours, 50 minutes ago”.

get_date_data(date_string, settings=None)
get_kwargs(date_string)
get_local_tz()
parse(date_string, settings)

dateparser.timezone_parser module

class dateparser.timezone_parser.StaticTzInfo(name, offset)

Bases: tzinfo

dst(dt)

datetime -> DST offset as timedelta positive east of UTC.

localize(dt, is_dst=False)
tzname(dt)

datetime -> string name of time zone.

utcoffset(dt)

datetime -> timedelta showing offset from UTC, negative values indicating West of UTC
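
The methods listed for StaticTzInfo follow the standard datetime.tzinfo interface. As a rough standard-library sketch of a fixed-offset zone with the same method names (the class name FixedOffsetZone is hypothetical, and this is not the library's code):

```python
from datetime import datetime, timedelta, timezone, tzinfo


class FixedOffsetZone(tzinfo):
    """Hypothetical fixed-offset zone mirroring the method names listed above."""

    def __init__(self, name, offset):
        self._name = name
        self._offset = offset  # a timedelta; negative means west of UTC

    def tzname(self, dt):
        return self._name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        return timedelta(0)  # a static zone has no daylight-saving shift

    def localize(self, dt, is_dst=False):
        # Attach this zone to a naive datetime without shifting the clock time.
        if dt.tzinfo is not None:
            raise ValueError("expected a naive datetime")
        return dt.replace(tzinfo=self)


cet = FixedOffsetZone("CET", timedelta(hours=1))
local = cet.localize(datetime(2000, 3, 23, 13, 21))
as_utc = local.astimezone(timezone.utc)
print(local.tzname(), as_utc)
```

In this sketch localize only attaches the zone, so the wall-clock time is unchanged. Converting to another zone is a separate astimezone call.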

dateparser.timezone_parser.build_tz_offsets(search_regex_parts)
dateparser.timezone_parser.convert_to_local_tz(datetime_obj, datetime_tz_offset)
dateparser.timezone_parser.get_local_tz_offset()
dateparser.timezone_parser.pop_tz_offset_from_string(date_string, as_offset=True)
dateparser.timezone_parser.word_is_tz(word)

dateparser.timezones module

dateparser.utils module

dateparser.utils.apply_dateparser_timezone(utc_datetime, offset_or_timezone_abb)
dateparser.utils.apply_timezone(date_time, tz_string)
dateparser.utils.apply_timezone_from_settings(date_obj, settings)
dateparser.utils.apply_tzdatabase_timezone(date_time, pytz_string)
dateparser.utils.combine_dicts(primary_dict, supplementary_dict)
dateparser.utils.find_date_separator(format)
dateparser.utils.get_last_day_of_month(year, month)
dateparser.utils.get_logger()
dateparser.utils.get_next_leap_year(year)
dateparser.utils.get_previous_leap_year(year)
dateparser.utils.get_timezone_from_tz_string(tz_string)
dateparser.utils.localize_timezone(date_time, tz_string)
dateparser.utils.normalize_unicode(string, form='NFKD')
dateparser.utils.registry(cls)
dateparser.utils.set_correct_day_from_settings(date_obj, settings, current_day=None)

Set the correct day according to the PREFER_DAY_OF_MONTH setting.

dateparser.utils.set_correct_month_from_settings(date_obj, settings, current_month=None)

Set the correct month according to the PREFER_MONTH_OF_YEAR setting.
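
PREFER_DAY_OF_MONTH takes the values 'current', 'first', or 'last'. The sketch below shows, with the standard library only, how such a preference might resolve to a concrete day. resolve_day is a hypothetical helper, not a dateparser function, and clamping an out-of-range current day to the end of the month is an assumption.

```python
import calendar
from datetime import datetime


def resolve_day(date_obj, preference, current_day):
    """Return date_obj with its day set according to the preference."""
    # monthrange returns (weekday of the 1st, number of days in the month).
    last_day = calendar.monthrange(date_obj.year, date_obj.month)[1]
    options = {
        "first": 1,
        "last": last_day,
        # Assumption: clamp so that e.g. day 31 stays valid in a shorter month.
        "current": min(current_day, last_day),
    }
    return date_obj.replace(day=options[preference])


feb = datetime(2015, 2, 1)
print(resolve_day(feb, "last", current_day=16))
print(resolve_day(feb, "current", current_day=31))
```

The same shape would apply to a month preference, with the lookup keyed on months of the year instead of days of the month.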

dateparser.utils.setup_logging()
dateparser.utils.strip_braces(date_string)

Module contents

dateparser.parse(date_string, date_formats=None, languages=None, locales=None, region=None, settings=None, detect_languages_function=None)

Parse a date and time from the given date string.

Parameters:
  • date_string (str) – A string representing date and/or time in a recognizably valid format.

  • date_formats (list) – A list of format strings using Python's strftime/strptime directives. The parser applies the formats one by one, taking into account the detected languages/locales.

  • languages (list) – A list of language codes, e.g. ['en', 'es', 'zh-Hant']. If locales are not given, languages and region are used to construct locales for translation.

  • locales (list) – A list of locale codes, e.g. ['fr-PF', 'qu-EC', 'af-NA']. The parser uses only these locales to translate the date string.

  • region (str) – A region code, e.g. 'IN', '001', 'NE'. If locales are not given, languages and region are used to construct locales for translation.

  • settings (dict) – Configure customized behavior using settings defined in dateparser.conf.Settings.

  • detect_languages_function (function) – A function for language detection that takes as input a string (the date_string) and a confidence_threshold, and returns a list of detected language codes. Note: this function is only used if languages and locales are not provided.

Returns:

A datetime representing the parsed date if parsing succeeds, otherwise None.

Return type:

datetime.datetime

Raises:

  • ValueError – Unknown language.

  • TypeError – The languages argument must be a list.

  • SettingValidationError – A provided setting is not valid.