The format and parse engine

Table Of Contents

Pluggable format engine
Works with every chronological entity
Modify a formatter by format attributes
Using multiple formats during parsing
How to handle timezones in formatting?
When printing
When parsing
Other specialized topics
Adjacent digit parsing
Handling of ordinal numbers
Partial formatting and the class ElementPosition
Combining formatters
How to format/parse historical dates?

Pluggable format engine

Users can choose between at least two different format engines. Both engines share the common facade TemporalFormatter. The package net.time4j.format.platform contains simple formatters which use platform-specific resources. The class SimpleFormatter simply delegates to java.text.SimpleDateFormat in a thread-safe way. This engine only supports the four basic types of Time4J and ZonalDateTime.

However, for higher demands on quality, special features and performance, users are strongly advised to use the expert engine by choosing the pattern class PatternType. All following paragraphs of this page refer to the expert engine.

Works with every chronological entity

The built-in format and parse engine of Time4J is designed to work with any chronological entity and does not need to know at compile time the concrete type of a time value to be formatted or parsed. At the core of this engine is the class ChronoFormatter. It is immutable, so you can safely store an instance in a static constant for maximum performance, even in a multithreaded environment. When you obtain or construct an instance of this class you tell Time4J which temporal type you want to format, for example via pattern-based factory methods such as ofDatePattern(...) and ofMomentPattern(...) or via the builder returned by setUp(...), all of which appear in the examples on this page.

Modify a formatter by format attributes

A format attribute is a key-value pair where the key is an implementation of the interface AttributeKey. This key determines the type of the associated attribute value, which must be immutable. Most attributes use booleans or enums. Predefined attribute keys can be found in the class Attributes. You can apply a format attribute to the whole formatter by calling one of its with({attribute-key}, {attribute-value})-methods. Example of a formatter that tolerates trailing characters during parsing:

  ChronoFormatter<PlainTime> formatter =
      ChronoFormatter.setUp(PlainTime.class, Locale.US)
      .addInteger(PlainTime.CLOCK_HOUR_OF_AMPM, 1, 2)
      .addLiteral(' ')
      .addText(PlainTime.AM_PM_OF_DAY)
      .padPrevious(3)
      .addFixedInteger(PlainTime.MINUTE_OF_HOUR, 2)
      .build()
      .with(Attributes.TRAILING_CHARACTERS, true);

  System.out.println(formatter.parse("5 PM 45xyz"));
  // Result: T17:45

Using multiple formats during parsing

If you are not sure which concrete format applies during parsing, the class MultiFormatParser lets you try several formatters in turn with good performance. An example that accepts either the German or the American form of a date (here fortunately distinguishable by different separator literals):

  static final MultiFormatParser<PlainDate> MULTI_FORMAT_PARSER;

  static {
    ChronoFormatter<PlainDate> germanStyle = 
      ChronoFormatter.ofDatePattern("dd.MM.uuuu", PatternType.CLDR, Locale.GERMAN);
    ChronoFormatter<PlainDate> usStyle = 
      ChronoFormatter.ofDatePattern("MM/dd/uuuu", PatternType.CLDR, Locale.US);
    MULTI_FORMAT_PARSER = MultiFormatParser.of(germanStyle, usStyle);
  }

  public List<PlainDate> parse() throws ParseException {
    String[] input = {"11.09.2001", "09/11/2001"};
    List<PlainDate> dates = new ArrayList<>();
    ParseLog plog = new ParseLog();

    for (String s : input) {
      plog.reset(); // initialization
      PlainDate date = MULTI_FORMAT_PARSER.parse(s, plog);

      if (date == null || plog.isError()) {
        System.out.println(
          "Wrong entry found: " + s + " at position " + dates.size() 
          + ", error-message=" + plog.getErrorMessage());
      } else {
        dates.add(date);
      }
    }

    return dates;
  }

Note that this approach avoids throwing and catching internal exceptions in most cases, which matters when processing bulk data. Furthermore, if you have only one locale, you might also consider a single format pattern string that uses the "|"-operator to specify several alternative patterns at once.
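The idea of trying several formats without exception-driven control flow can be sketched with JDK classes only. This is an illustration of the technique, not Time4J's implementation: the class name FirstMatchParser and the method parseOrNull are hypothetical, and the sketch uses java.time's DateTimeFormatter.parseUnresolved, which reports a failed parse through the ParsePosition instead of throwing.

```java
import java.text.ParsePosition;
import java.time.DateTimeException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoField;
import java.time.temporal.TemporalAccessor;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, JDK only: tries each format in order and returns the first match.
final class FirstMatchParser {
    private static final List<DateTimeFormatter> FORMATS = List.of(
        DateTimeFormatter.ofPattern("dd.MM.uuuu", Locale.GERMAN),
        DateTimeFormatter.ofPattern("MM/dd/uuuu", Locale.US));

    /** Returns the parsed date, or null if no format matches the whole input. */
    static LocalDate parseOrNull(String text) {
        for (DateTimeFormatter format : FORMATS) {
            ParsePosition pos = new ParsePosition(0);
            // parseUnresolved signals a syntax error via null and the error index, not via an exception
            TemporalAccessor raw = format.parseUnresolved(text, pos);
            if (raw == null || pos.getErrorIndex() >= 0 || pos.getIndex() < text.length()) {
                continue; // this format does not fit, try the next one
            }
            try {
                // raw values are not validated yet, so LocalDate.of does the range check
                return LocalDate.of(
                    Math.toIntExact(raw.getLong(ChronoField.YEAR)),
                    Math.toIntExact(raw.getLong(ChronoField.MONTH_OF_YEAR)),
                    Math.toIntExact(raw.getLong(ChronoField.DAY_OF_MONTH)));
            } catch (DateTimeException | ArithmeticException ex) {
                continue; // syntactically fine but not a valid date
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(parseOrNull("11.09.2001"));
        System.out.println(parseOrNull("09/11/2001"));
        System.out.println(parseOrNull("2001-09-11"));
    }
}
```

An exception is only involved in the rare case of a syntactically matching but invalid date, which mirrors the "in most cases" qualification above.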

How to handle timezones in formatting?

Combined representations of date-times with timezone names or offsets can be processed, too. The right temporal type for this purpose is the class Moment. However, this type does not carry a timezone of its own for format purposes: all its time calculations are fixed to UTC+00:00 and it only has a UTC representation. Instead you have to supply an extra timezone to the formatter to make it work for any timezone deviating from UTC.

When printing

You just need to enhance any ChronoFormatter<Moment> with a timezone. Then the formatter will use this timezone to convert a given Moment to a zonal timestamp representation and - if specified in the format - use it for printing the timezone id, name or offset. Example using first the system timezone and then overriding it:

  TemporalFormatter<Moment> fmt = 
    ChronoFormatter.ofMomentPattern("HH:mm z", PatternType.CLDR, Locale.getDefault(), Timezone.ofSystem().getID());
  Moment now = SystemClock.currentMoment();

  System.out.println(fmt.format(now)); 
  // output in New York (US) as system timezone => 17:45 EDT

  System.out.println(fmt.withTimezone(AMERICA.LOS_ANGELES).format(now)); 
  // output in Los Angeles (US) => 14:45 PDT

When parsing

The timezone attribute on the formatter serves here only as a fallback if no suitable timezone information can be found in the input to be parsed. Therefore you can parse any timezone-related input to a Moment without specifying a timezone, provided the input contains timezone information or an offset. The resulting Moment does NOT keep this zone information. However, if you also want to know which timezone was parsed, you can do one of the following:

a) Using the class ParseLog

  ChronoFormatter<Moment> fmt = 
    ChronoFormatter.setUp(Moment.class, Locale.US)
    .addPattern("MM/dd/uuuu HH:mmxxx", PatternType.CLDR)
    .build().withTimezone(AMERICA.NEW_YORK);
  ParseLog plog = new ParseLog();
  Moment result = fmt.parse("09/11/2001 17:45-04:00", plog);
  System.out.println(result); // 2001-09-11T21:45:00Z
  System.out.println(plog.getRawValues().getTimezone()); // -04:00

b) Using the class ZonalDateTime

  TemporalFormatter<Moment> fmt = 
    ChronoFormatter.ofMomentPattern("MM/dd/uuuu HH:mmxxx", PatternType.CLDR, Locale.US, AMERICA.NEW_YORK);
  ZonalDateTime result = ZonalDateTime.parse("09/11/2001 17:45-04:00", fmt);
  System.out.println(result); // 2001-09-11T17:45-04:00

Other specialized topics

Adjacent digit parsing

If two or more chronological elements with numerical representations follow each other with no literal in between, the question arises how the parse engine of Time4J can recognize where one numerical element stops and the next one begins. Time4J resolves this problem by automatically switching to a special mode called "adjacent digit parsing" when the following holds:

All numerical elements in a given sequence except the first one have fixed width. This is the case as long as you use builder methods like addFixedInteger(...) for all elements after the first one. The mechanism covers all types of numbers, namely integers, decimals and fractions. Note that fractional elements have fixed width if minimum and maximum width are the same. Time4J will then parse the first variable-width element with the reserved width of the following elements in mind.

  ChronoFormatter<PlainTimestamp> formatter =
    ChronoFormatter
      .setUp(PlainTimestamp.class, Locale.ROOT)
      .addInteger(PlainDate.YEAR, 4, 9)
      .addFixedNumerical(PlainDate.MONTH_OF_YEAR, 2)
      .addFixedInteger(PlainDate.DAY_OF_MONTH, 2)
      .addFixedInteger(PlainTime.ISO_HOUR, 2)
      .addFraction(PlainTime.NANO_OF_SECOND, 6, 6, false)
      .build();
  PlainTimestamp tsp = formatter.parse("2000022917123456");
  System.out.println(tsp); // 2000-02-29T17:00:00.123456
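The width reservation described above can be sketched in plain Java. This is only an illustration of the principle under simplifying assumptions (the input consists of digits only and is consumed completely), not the actual Time4J algorithm; the class AdjacentDigits and its method split are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical sketch: one leading variable-width number followed by fixed-width numbers.
final class AdjacentDigits {
    /**
     * Splits a run of digits. The first number gets whatever is left after
     * reserving the widths of all following fixed-width numbers.
     */
    static long[] split(String digits, int minFirst, int maxFirst, int... fixedWidths) {
        int reserved = Arrays.stream(fixedWidths).sum();
        int firstWidth = digits.length() - reserved;
        if (firstWidth < minFirst || firstWidth > maxFirst) {
            throw new IllegalArgumentException("Unexpected count of digits: " + digits);
        }
        long[] values = new long[fixedWidths.length + 1];
        values[0] = Long.parseLong(digits.substring(0, firstWidth));
        int pos = firstWidth;
        for (int i = 0; i < fixedWidths.length; i++) {
            values[i + 1] = Long.parseLong(digits.substring(pos, pos + fixedWidths[i]));
            pos += fixedWidths[i];
        }
        return values;
    }

    public static void main(String[] args) {
        // year (4 to 9 digits), then month, day, hour (2 digits each) and a 6-digit fraction
        long[] parts = split("2000022917123456", 4, 9, 2, 2, 2, 6);
        System.out.println(Arrays.toString(parts)); // [2000, 2, 29, 17, 123456]
    }
}
```

With the input of the example above, 12 of the 16 digits are reserved for the fixed-width elements, so the year receives exactly the remaining 4 digits.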

Handling of ordinal numbers

Some languages use special letter indicators as suffix of a numerical element. A well-known example is English. You can format or parse the day of month as ordinal as follows:

  ChronoFormatter<PlainDate> formatter =
    ChronoFormatter
      .setUp(PlainDate.class, Locale.ENGLISH)
      .addEnglishOrdinal(PlainDate.DAY_OF_MONTH)
      .addLiteral(" of ")
      .addText(PlainDate.MONTH_OF_YEAR)
      .addLiteral(" ")
      .addInteger(PlainDate.YEAR, 4, 9)
      .build();
  PlainDate date = formatter.parse("21st of March 2015");
  System.out.println(date); // 2015-03-21
  System.out.println(formatter.format(date)); // 21st of March 2015

Partial formatting and the class ElementPosition

UI applications sometimes need partial formatting, that is, the information where in the formatted output a chronological element starts and ends. Some print()-methods of the class ChronoFormatter return sets of element positions. This is especially useful if the formatted width cannot be predicted because it depends on variable-width elements and/or localization. An example might be displaying the day-of-week element in red text color if the date is Friday the 13th:

  ChronoFormatter<PlainDate> formatter = // type-cast okay if i18n-module is loaded
    (ChronoFormatter<PlainDate>) PlainDate.localFormatter(DisplayMode.FULL);
  StringBuilder buffer = new StringBuilder();
  PlainDate today = SystemClock.inLocalView().today();
  Set<ElementPosition> positions = formatter.print(today, buffer);

  if (
    (today.get(PlainDate.DAY_OF_WEEK) == Weekday.FRIDAY)
    && (today.getDayOfMonth() == 13)
  ) {
    for (ElementPosition pos : positions) {
      if (pos.getElement() == PlainDate.DAY_OF_WEEK) {
        int start = pos.getStartIndex();
        int end = pos.getEndIndex();
        String startTag = "<span style=\"color:#FF0000;\">";
        String endTag = "</span>";
        buffer.insert(start, startTag);
        buffer.insert(end + startTag.length(), endTag);
        break;
      }
    }
  }

  String html = "<code>" + buffer.toString() + "</code>";

The feature of partial formatting corresponds to the older AttributedCharacterIterator returned by the JDK class java.text.Format. Time4J can therefore offer a conversion to the old format style that preserves the attributed format information, that is, the element positions.
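For comparison, the following sketch shows how the old JDK style exposes the same kind of position information. It uses only java.text classes and no Time4J code; the class WeekdayRun and its methods are hypothetical names, and the pattern, locale and UTC timezone are fixed assumptions to keep the result predictable.

```java
import java.text.AttributedCharacterIterator;
import java.text.CharacterIterator;
import java.text.DateFormat;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical sketch: locate the day-of-week text via AttributedCharacterIterator.
final class WeekdayRun {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    /** Creates a java.util.Date at noon UTC of the given calendar date. */
    static Date utcNoon(int year, int month, int day) {
        GregorianCalendar cal = new GregorianCalendar(UTC, Locale.US);
        cal.clear();
        cal.set(year, month - 1, day, 12, 0, 0);
        return cal.getTime();
    }

    /** Returns {start, end} of the day-of-week text (end exclusive), or null if absent. */
    static int[] weekdayRange(Date date) {
        SimpleDateFormat sdf = new SimpleDateFormat("EEEE, d MMMM yyyy", Locale.US);
        sdf.setTimeZone(UTC);
        AttributedCharacterIterator it = sdf.formatToCharacterIterator(date);
        for (char c = it.first(); c != CharacterIterator.DONE; c = it.next()) {
            if (it.getAttribute(DateFormat.Field.DAY_OF_WEEK) != null) {
                return new int[] {
                    it.getRunStart(DateFormat.Field.DAY_OF_WEEK),
                    it.getRunLimit(DateFormat.Field.DAY_OF_WEEK)
                };
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int[] range = weekdayRange(utcNoon(2015, 3, 13)); // "Friday, 13 March 2015"
        System.out.println(range[0] + ".." + range[1]); // 0..6
    }
}
```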

Combining formatters

Combining date and time formatters is easy because chronological elements can have any value type V, including types like PlainDate.

  ChronoFormatter<PlainTimestamp> formatter =
    ChronoFormatter.setUp(PlainTimestamp.class, Locale.ROOT)
      .addCustomized(PlainDate.COMPONENT, Iso8601Format.EXTENDED_CALENDAR_DATE)
      .addLiteral('T', ' ') // using space as alternative literal in parsing
      .addCustomized(PlainTime.COMPONENT, Iso8601Format.EXTENDED_WALL_TIME)
      .build();
  assertThat(formatter.parse("2015-05-13T17:45"), is(PlainTimestamp.of(2015, 5, 13, 17, 45)));
  assertThat(formatter.parse("2015-05-13 17:45"), is(PlainTimestamp.of(2015, 5, 13, 17, 45)));

How to format/parse historical dates?

If an era element is present in the ChronoFormatter then the format engine switches to historic dates as soon as the format locale contains a country part or if you explicitly set the history. Because every country has its own history of Gregorian calendar reforms, Time4J offers some methods to configure the history to be used. Example for Sweden, which between 1700 and 1712 used its own calendar variant running one day ahead of the Julian calendar (see also Wikipedia):

  ChronoFormatter<PlainDate> formatter =
    ChronoFormatter.ofDatePattern("d. MMMM yyyy GGGG", PatternType.CLDR, new Locale("sv"));
  String s = formatter.with(ChronoHistory.ofSweden()).format(PlainDate.of(1712, 3, 11));
  System.out.println(s); // 30. februari 1712 efter Kristus
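The date arithmetic behind this output can be checked with JDK classes alone. The sketch below is not Time4J code; it assumes the historical fact that the Swedish 30 February 1712 fell on the Julian 29 February 1712 and converts that Julian date to the ISO (proleptic Gregorian) date. The class SwedishCalendarSketch and the method julianToIso are hypothetical names.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical sketch: converts a Julian calendar date to the ISO date of the same day.
final class SwedishCalendarSketch {
    static LocalDate julianToIso(int year, int month, int dayOfMonth) {
        GregorianCalendar julian = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        // a cutover in the far future turns GregorianCalendar into a pure Julian calendar
        julian.setGregorianChange(new Date(Long.MAX_VALUE));
        julian.clear();
        julian.set(year, month - 1, dayOfMonth);
        return Instant.ofEpochMilli(julian.getTimeInMillis())
            .atZone(ZoneOffset.UTC)
            .toLocalDate();
    }

    public static void main(String[] args) {
        // Swedish 30 February 1712 = Julian 29 February 1712 (assumption stated above)
        System.out.println(julianToIso(1712, 2, 29)); // 1712-03-11
    }
}
```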

If you explicitly call with(ChronoHistory) then the era might be left out for printing, depending on the format. Parsing of historical dates always requires the era, but you can of course apply a default era like HistoricEra.AD on the formatter.

  ChronoHistory history = ChronoHistory.ofSweden();
  ChronoFormatter<PlainDate> formatter =
    ChronoFormatter.ofDatePattern("d. MMMM yyyy", PatternType.CLDR, new Locale("sv"))
      .with(history)
      .withDefault(history.era(), HistoricEra.AD);
  PlainDate date = formatter.parse("30. februari 1712");
  System.out.println(date); // 1712-03-11