Informats
The DLMReader package uses informats to call a special type of function on raw text before parsing its values. This makes DLMReader flexible and enables it to handle messy delimited files.
The package ships with the pre-registered informats listed below (more may be added in future releases):
- STRIP!: Remove leading and trailing blanks
- COMMA!: Remove $ and , (thousands separators) from numbers
- COMMAX!: Treat , as the decimal point, and remove € and . (thousands separators) from numbers
- NA!: Treat NA, na, Na, nA as missing
- BOOL!: Convert any form of True and False to 1 and 0, respectively
- ACC!: Treat numbers in parentheses (accounting format) as negative values, i.e. it replaces ( with - and ) with a blank
- COMPRESS!: Remove all blanks (0x20)
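To make the descriptions above concrete, the following plain-Julia sketch models the described effect of COMMA!, COMMAX! and ACC! on ordinary strings. It is not the package's implementation: the helper names comma_effect, commax_effect and acc_effect are hypothetical, and the real informats operate in place on DLMReader's own string type.

```julia
# Hypothetical helpers that mimic the documented effect on a plain String.
# They allocate new strings, unlike the in-place informats in DLMReader.

# COMMA!: drop "$" and "," (thousands separators)
comma_effect(s) = replace(s, r"[$,]" => "")

# COMMAX!: drop "€" and "." (thousands separators), then read "," as the decimal point
commax_effect(s) = replace(replace(s, r"[€.]" => ""), "," => ".")

# ACC!: "(" becomes "-" and ")" becomes a blank
acc_effect(s) = replace(replace(s, "(" => "-"), ")" => " ")

comma_effect("\$1,234.50")   # "1234.50"
commax_effect("€1.234,50")   # "1234.50"
acc_effect("(500)")          # "-500 "
```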
Users can define their own informats. An informat is basically a function with one positional argument: it must accept a special mutable string and return its modified value (or a subset of it). To use a newly defined function as an informat, the user must register it by calling the register_informat function.
Note that if the function definition is changed, DLMReader uses the new definition, and the user MUST re-register it to avoid extra allocation.
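A minimal sketch of defining and registering an informat might look like the following. The function name semicolon_to_comma! is hypothetical, the single-argument call to register_informat is an assumption based on the description above, and the names are qualified with the module because whether they are exported is not stated here.

```julia
using DLMReader  # assumes the package is installed

# Hypothetical informat: swap every ";" for "," in the raw field text.
# It takes one positional argument, modifies it in place, and returns it.
function semicolon_to_comma!(x)
    replace!(x, ";" => ",")  # same-length replacement, so nothing is padded or truncated
    return x
end

# Register the function so that it can be used as an informat.
DLMReader.register_informat(semicolon_to_comma!)

# If the definition of semicolon_to_comma! changes later, register it again.
```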
Supported string manipulation functions
The function used as an informat must accept a special mutable string type and return it. Thus, the user must only modify the input argument in place. Currently, only a few string manipulation functions are optimised for this purpose; they are listed below. For small to medium-sized files, the user can use any other function that supports AbstractString to manipulate the input argument. (In the following list, x refers to the positional argument of the user-defined informat.)
- isequal: Use isequal to check whether the input argument, or a sub-string of it, is equal to a string.
- setindex!: To assign a string to the input argument, use the setindex!(x, "newtext") syntax. If the new text is shorter than x, it is padded with blanks (0x20); if it is longer than x, it is truncated.
- replace!: Use replace! to replace part of the input argument. For instance, replace!(x, "12"=>"21") replaces every occurrence of "12" with "21". Note that a shorter replacement text is padded with blanks and a longer one is truncated.
- occursin: occursin("text", x) determines whether the first argument is a substring of the second.
  - Regular expressions can be used too, i.e. occursin(r::Regex, x)
- contains: contains(x, "text") determines whether the second argument is a substring of the first.
  - Regular expressions can be used too, i.e. contains(x, r::Regex)
- findfirst: Search for the index of the first occurrence of a regular expression pattern via the allocation-free function findfirst, i.e. findfirst(r::Regex, x)
- findnext: Search for the index of the next occurrence of a regular expression pattern via the allocation-free function findnext, i.e. findnext(r::Regex, x, i)
- remove!: Removes some text from the input argument by replacing it with blanks. It can remove a String constant or a sequence of string locations provided as a range, e.g. remove!(x, "text") or remove!(x, 2:4)
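The padding and truncation rules above follow from the input argument having a fixed length. The sketch below models that behaviour in plain Julia with a byte vector. It is only an illustration under that assumption: assign_fixed! and blank_range! are hypothetical stand-ins for the described behaviour of setindex! and remove!, not DLMReader's own code or types.

```julia
# Plain-Julia model of a fixed-length field (not DLMReader's internal type).
const BLANK = 0x20

# Overwrite the whole buffer with `text`: pad with blanks if `text` is shorter,
# truncate if it is longer. The buffer length never changes.
function assign_fixed!(buf::Vector{UInt8}, text::AbstractString)
    bytes = codeunits(text)
    for i in eachindex(buf)
        buf[i] = i <= length(bytes) ? bytes[i] : BLANK
    end
    return buf
end

# Blank out a range of positions, mirroring the description of remove!(x, 2:4).
function blank_range!(buf::Vector{UInt8}, r::UnitRange{Int})
    for i in r
        buf[i] = BLANK
    end
    return buf
end

field = Vector{UInt8}("abcdef")                 # a six-byte field
String(copy(assign_fixed!(field, "xy")))        # "xy    "  (padded)
String(copy(assign_fixed!(field, "12345678")))  # "123456"  (truncated)
String(copy(blank_range!(field, 2:4)))          # "1   56"
```

Because the length is fixed, an informat that needs to shorten a value can only blank out the unwanted characters rather than delete them.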
 
Users can test newly defined informats by calling DLMReader.test(f, text), where f is the user-defined informat and text is a given string.
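As a rough sketch, a test call could look like the following. The informat blank_dash! and the sample text are hypothetical, and no output is shown because the return value of DLMReader.test is not described here.

```julia
using DLMReader  # assumes the package is installed

# Hypothetical informat: blank out a lone "-" that stands for a missing value.
function blank_dash!(x)
    if isequal(x, "-")
        replace!(x, "-" => " ")  # same length, so the field is not padded or truncated
    end
    return x
end

# Try the informat on a sample string before using it on a real file.
DLMReader.test(blank_dash!, "-")
```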