hyperformula icon indicating copy to clipboard operation
hyperformula copied to clipboard

Global settings

Open wojciechczerniak opened this issue 6 years ago • 8 comments

Description

There are some differences between spreadsheet applications, and some of them can be compensated with customizable options.

Done Option Name Default Description
undoLimit 20 #295
maxRows 40,000 #294 for #198
maxColumns 18,278 #294 for #198
licenseKey undefined #222
nullDate 1900 Date is stored as a number of days since this date. There are a few "date systems" in the world of spreadsheets: 1900 Date System, 1904 Date System [1] #3
nullYear 30 If date contains two-digit year it will be parsed as 19xx for numbers above this setting and 20xx for numbers below i.e 2029 but 1930. ODFF section 6.10.23
leapYear1900 false Callback that will allow handling leap year compatibility problem (introduced in Lotus 1-2-3) in various spreadsheets [2] #3
decimalSeparator . Number format #42
thousandSeparator `` Number format #42
language enGB Formula translations
accentSensitive false String comparison #130
caseFirst "lower" String comparison #130
ignorePunctuation false String comparison #130
localeLang "en" String comparison #130
caseSensitive false Comparison operators =, <>, <, <=, >, and >= can be either case sensitive or insensitive. Exposed as an option in many spreadsheet apps
intersectionOperator (space) ODFF section 5.5 and 6.4.12, default is ! but some implementations (XL, GS) use a space character. Example =SUM([.A1:.C4]![.B1:.B5]) vs =SUM(A1:C4 B1:B5) [3] #15
unionOperator , ODFF section 6.4.13, default is ~ but some implementations use comma , or + character. Example =SUM([.A1:.C4];[.B1:.B5]) vs =SUM(A1:C4, B1:B5) [3] #15
rangeOperator : Optional. There are no other, known implementations.
sheetSeparator . In ODFF the dot . is a part of cell reference so there is no such thing as sheet separator (?), but most of the implementations ommit dots and it exist only when worksheet reference is needed
functionArgSeparator ; Default for ODFF is ; semicolon, mentioned in section 5.6 and 6.2. Might be a comma , in some implementations (GS 👀 ) #15
useWildcards true #220
useRegularExpresssions false #220
matchWholeCell true #220
precisionEpsilon 1e-13 Controls epsilon used with comparisons between numbers, and snap-to-zero behaviour #4
precisionRounding 14 Controls number of digits the number is rounded with when API function getCellValue() #4
smartRounding false Flag for snap-to-zero, rounding and safe-comparison (if true then those heuristics are used). #4
parseDateTime Callback that will allow to add more date formats and use any date library that developer prefers
stringifyDateTime Callback that will allow to add more date formats and use any date library that developer prefers

Where the defaults should be roughly compatible with XL implementation.

Host Defined Behaviors

ODFF [4] defines host defined behaviors in section 3.4. Some of them were already mentioned above. The full list is as follows:

The following properties are host-defined:

  1. HOST-CASE-SENSITIVE: if true, text comparisons are case-sensitive. This influences the operators =, <>, <, <=, >, and >=, as well as database query functions that use them. Note that the EXACT function is always case-sensitive, regardless of this calculation setting.
  2. HOST-PRECISION-AS-SHOWN: If true, calculations are performed using rounded values of those displayed; otherwise, calculations are performed using the precision of the underlying numeric representation. Note: This does not impose a particular numeric model. Since implementations may use binary representations, this rounding may be inexact for decimal values.
  3. HOST-SEARCH-CRITERIA-MUST-APPLY-TO-WHOLE-CELL If true, the specified search criteria shall apply to the entire cell contents if it is a text match using = or <>; if not, only a subpart of the cell content needs to match the text.
  4. HOST-AUTOMATIC-FIND-LABELS: if true, row and column labels are automatically found.
  5. HOST-USE-REGULAR-EXPRESSIONS: If true, regular expressions are used for character string comparisons and when searching.
  6. HOST-USE-WILDCARDS: If true, wildcards question mark '?' and asterisk '*' are used for character-string comparisons and when searching. Wildcards may be escaped with a tilde '~' character.
  7. HOST-NULL-YEAR: This defines how to convert a two-digit year into a four-digit year. Each two-digit year value is interpreted as a year that equals or follows this year.
  8. HOST-NULL-DATE: Defines the beginning of the epoch; a numeric date of 0 equals this date.
  9. HOST-LOCALE: The locale to be used for locale-dependent operations, such as conversion of text to dates, or text to numbers.
  10. HOST-ITERATION-STATUS: If enabled, iterative calculations of cyclic references are performed.
  11. HOST-ITERATION-MAXIMUM-DIFFERENCE: If iterative calculations of cyclic references are enabled, the maximum absolute difference between calculation steps that all involved formula cells must yield for the iteration to end and yield a result.
  12. HOST-ITERATION-STEPS: If iterative calculations of cyclic references are enabled, the maximum number of steps iterations that are performed if the results are not within HOST-ITERATION-MAXIMUM-DIFFERENCE. The function HOST-REFERENCE-RESOLVER(Reference) is implementation defined. This function takes as input a Unicode string containing a Reference according to section 4.8 and returns a resolved value.

This might be used as an inspiration.

References

[1] https://docs.microsoft.com/en-us/office/troubleshoot/excel/1900-and-1904-date-system [2] https://docs.microsoft.com/en-us/office/troubleshoot/excel/determine-a-leap-year [3] https://exceloffthegrid.com/cell-ranges-basic-things-99-users-dont-know/ [4] https://www.oasis-open.org/committees/documents.php?wg_abbrev=office [5] https://help.libreoffice.org/6.3/en-US/text/shared/optionen/01060500.html?DbPAR=SHARED#bm_id3149399

wojciechczerniak avatar Dec 04 '19 11:12 wojciechczerniak

Very good idea from @swistak35 https://github.com/handsontable/hyperformula/commit/37c1d1d6412727d1358e10b1d66144120904abba#commitcomment-36283037, we should validate customized separators and operators to avoid ambiguity in the parser.

wojciechczerniak avatar Dec 05 '19 12:12 wojciechczerniak

Added to config as variables:

precisionEpsilon, 1e-13, Controls epsilon used with comparisons between numbers, and snap-to-zero behaviour. precisionRounding, 14, Controls number of digits the number is rounded with when API function getCellValue() smartRounding, true, Flag for snap-to-zero, rounding and safe-comparison (if true then those heuristics are used).

izulin avatar Jan 23 '20 16:01 izulin

How is nullDate default being 1900 consistent with us right now counting the dates from 30/12/1899? Shouldnt there be more parameters?

izulin avatar Feb 04 '20 13:02 izulin

How is nullDate default being 1900 consistent with us right now counting the dates from 30/12/1899? Shouldnt there be more parameters?

Depends. It may be a preset or we can allow to set 01/01/1904 or 30/12/1899 or 31/12/1899 and explain in the docs why. This also depends on leapYear1900 so the original idea was the preset

if (nullDate === 1900) {
  if (leapYear1900 === true) {
     this.nullDate = { year: 1899, month: 12, day: 30 }
  } else {
     this.nullDate = { year: 1899, month: 12, day: 31 }
  }
} else if (nullDate === 1904) {
     this.nullDate = { year: 1903, month: 12, day: 31 }
} else { 
  throw new Error('Invalid preset');
}

but maybe you have a better idea @izulin. There might be an edge case that I forgot.

nullDate can be a {year: number, month: number, day: number} type. Just take under consideration the 1900 leap year bug/compatibility.

Evaluators shall support all dates from 1904-01-01 through 9999-12-31 (inclusive) in calculations, should support dates from 1899-12-30 through 9999-12-31 (inclusive) and may support a wider date range. Note 1: Using expressions that assume serial numbers are based on a particular epoch may cause interoperability issues. Evaluators shall support positive serial numbers. Evaluators may support negative serial numbers to represent dates before an epoch. Note 2: It is implementation-defined if the year 1900 is treated as a leap year. Note 3: Evaluators that treat 1900 as a non-leap year can use the epoch date 1899-12-30 to compensate for serial numbers that originate from evaluators that treat 1900 as a leap year and use 1899-12-31 as an epoch date.

Source: https://docs.oasis-open.org/office/OpenDocument/v1.3/csprd02/part4-formula/OpenDocument-v1.3-csprd02-part4-formula.html#__RefHeading__1017888_715980110

wojciechczerniak avatar Feb 04 '20 14:02 wojciechczerniak

Implemented: excelDateCompatibility: boolean zeroDate: {year: number, mont: number, day: number} nullYear: value separating 19xx from 20xx in parsing https://github.com/handsontable/hyperformula/pull/141/commits/9f6aed619e1b5ef17973937cfb22ead364e1e16a

izulin avatar Feb 05 '20 12:02 izulin

Note: I have updated the list with updates from @izulin. Added decimalSeparator #42 and removed culture which was replaced by it.

wojciechczerniak avatar Feb 18 '20 14:02 wojciechczerniak

There are some settings in Config class that will need documentation

@scarletfog we need to ask @bardek8 @izulin @voodoo11 @swistak35 for descriptions and their influence on the results

    chooseAddressMappingPolicy: new AlwaysDense(),
    functionArgSeparator: ',',
    functionPlugins: [],
    gpuMode: 'gpu',
    matrixDetection: true,
    matrixDetectionThreshold: 100,
    useColumnIndex: false,
    vlookupThreshold: 20,

wojciechczerniak avatar Feb 19 '20 10:02 wojciechczerniak

maxColumns, maxRows added in #294 licenseKey added in #293

wojciechczerniak avatar Apr 20 '20 09:04 wojciechczerniak