Enum member isLexer

Checks whether its argument fulfills all requirements to be used as an XML lexer.

An XML lexer is the first component in the parsing chain. It hides the shape of the input and the type of its characters from the parser. The slices returned by the lexer are ephemeral: any reference to them may be invalidated when the parser requests a new slice. It is thus the responsibility of the user to copy the output if necessary.
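The following sketch illustrates why copying can matter. It assumes a hypothetical source type, `ReusingSource`, that serves every token from a single internal buffer; this type is not part of the library and is not a full lexer, it only mimics one way in which a previously returned slice could be invalidated.

```d
// Hypothetical toy source (an assumption for illustration only):
// it reuses one internal buffer for every token it hands out.
struct ReusingSource
{
    private char[8] buffer;
    private size_t length;

    // Loads the next token into the shared buffer, overwriting the previous one.
    void load(string token)
    {
        assert(token.length <= buffer.length);
        buffer[0 .. token.length] = token[];
        length = token.length;
    }

    // Returns a slice of the internal buffer, valid only until the next load.
    const(char)[] get()
    {
        return buffer[0 .. length];
    }
}

unittest
{
    ReusingSource src;
    src.load("key");
    auto ephemeral = src.get();   // aliases the internal buffer
    auto saved = src.get().idup;  // independent, immutable copy

    src.load("val");
    assert(ephemeral == "val");   // the earlier slice now shows the new data
    assert(saved == "key");       // the copy is unaffected
}
```

Whether a real lexer behaves this way depends on its implementation; calling `.idup` (or `.dup`) on the result of `get` is the defensive choice when a token must outlive the next request.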


enum isLexer(L) = is(typeof(
(inout int = 0)
{
    alias C = L.CharacterType;

    L lexer;
    char c;
    bool b;
    string s;
    C[] cs;

    b = lexer.empty;
    lexer.start();
    cs = lexer.get();
    b = lexer.testAndAdvance(c);
    lexer.advanceUntil(c, b);
    lexer.advanceUntilAny(s, b);
    lexer.dropWhile(s);
}));


L: the type to be tested


true if L satisfies the XML lexer specification stated here; false otherwise
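As a rough illustration of how such a check yields true or false, the sketch below applies the same `is(typeof(...))` idiom to a reduced, hypothetical trait named `hasEmptyAndGet`. The trait and the types `Good` and `Bad` are assumptions made for this example and are not part of the library: the trait is true exactly when the body of the function literal compiles for the given type.

```d
// Hypothetical, reduced trait: only checks for `empty` and `get`.
enum hasEmptyAndGet(T) = is(typeof(
(inout int = 0)
{
    T t;
    bool b = t.empty;   // must be usable as a bool
    auto s = t.get();   // must be callable without arguments
}));

// Provides both members, so the literal compiles and the trait is true.
struct Good
{
    bool empty() { return true; }
    string get() { return null; }
}

// Lacks get(), so the literal does not compile and the trait is false.
struct Bad
{
    bool empty() { return true; }
}

static assert( hasEmptyAndGet!Good);
static assert(!hasEmptyAndGet!Bad);
static assert(!hasEmptyAndGet!int);

// The trait can then guard a template, as isLexer does in a template constraint.
size_t tokenLength(T)(ref T t)
    if (hasEmptyAndGet!T)
{
    return t.get().length;
}
```

A type that fails the check is rejected at compile time when it is passed to a template constrained this way, rather than producing an error deep inside the template body.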


A lexer shall support at least these methods and aliases:

  • alias CharacterType: the type of a single source character; most methods will deal with slices of this type;
  • alias InputType: the type of the input which is used to feed this lexer;
  • void setSource(InputType): sets the input source for this lexer; the lexer may perform other initialization work and even consume part of the input during this operation; after (partial or complete) usage, a lexer may be reinitialized and used with another input by calling this function;
  • bool empty(): returns true if the entire input has been consumed; false otherwise;
  • void start(): instructs the lexer that a new token starts at the current position; subsequent calls to get will return the input from this position onwards; this call may invalidate any reference to any slice previously returned from get;
  • CharacterType[] get(): returns the contents of the input from the position of the last call to start up to the current position;
  • bool testAndAdvance(CharacterType): tests whether the input character at the current position matches the one passed as parameter; if it is the case, this method returns true and advances the input past the said character; otherwise, it returns false and no action is performed;
  • void advanceUntil(CharacterType, bool): advances the input until the given character is found; if the second parameter is true, the input is then advanced past the found character;
  • void advanceUntilAny(CharacterType[], bool): advances the input until any of the given characters is found; if the second parameter is true, the input is then advanced past the found character;
  • void dropWhile(CharacterType[]): advances the input until a character different from the given ones is found; the characters advanced by this method may or may not be included in the output of a subsequent get; for this reason, this method should only be called immediately before start, to skip unneeded characters between two tokens.
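The list above can be made concrete with a minimal sketch of a conforming lexer over a `string`. The type `StringLexer` is hypothetical and written only for illustration; the `isLexer` check is copied from the declaration above so that the example compiles on its own. Note that the check as declared does not exercise `setSource` or `InputType`; the sketch provides them anyway because the list requires them.

```d
// Copy of the isLexer check from the declaration above.
enum isLexer(L) = is(typeof(
(inout int = 0)
{
    alias C = L.CharacterType;
    L lexer; char c; bool b; string s; C[] cs;
    b = lexer.empty;
    lexer.start();
    cs = lexer.get();
    b = lexer.testAndAdvance(c);
    lexer.advanceUntil(c, b);
    lexer.advanceUntilAny(s, b);
    lexer.dropWhile(s);
}));

// Hypothetical minimal lexer: slices an in-memory string, never copies.
struct StringLexer
{
    alias CharacterType = immutable(char);
    alias InputType = string;

    private string input;  // the whole source
    private size_t begin;  // where the current token started
    private size_t pos;    // current position

    private static bool contains(const(char)[] set, char c)
    {
        foreach (char s; set)
            if (s == c)
                return true;
        return false;
    }

    void setSource(InputType source) { input = source; begin = pos = 0; }
    bool empty() { return pos >= input.length; }
    void start() { begin = pos; }
    CharacterType[] get() { return input[begin .. pos]; }

    bool testAndAdvance(char c)
    {
        if (empty || input[pos] != c)
            return false;
        ++pos;
        return true;
    }

    void advanceUntil(char c, bool included)
    {
        while (!empty && input[pos] != c)
            ++pos;
        if (included && !empty)
            ++pos;  // step past the character that was found
    }

    void advanceUntilAny(const(char)[] chars, bool included)
    {
        while (!empty && !contains(chars, input[pos]))
            ++pos;
        if (included && !empty)
            ++pos;
    }

    void dropWhile(const(char)[] chars)
    {
        while (!empty && contains(chars, input[pos]))
            ++pos;
    }
}

static assert(isLexer!StringLexer);
static assert(!isLexer!int);

unittest
{
    StringLexer lexer;
    lexer.setSource("  hello world");
    lexer.dropWhile(" ");            // skip leading spaces, then start a token
    lexer.start();
    lexer.advanceUntil(' ', false);
    assert(lexer.get() == "hello");
    assert(lexer.testAndAdvance(' '));
    lexer.start();
    lexer.advanceUntilAny(" \t", false);  // no match, so it runs to the end
    assert(lexer.get() == "world");
    assert(lexer.empty);
}
```

Because this sketch slices immutable input, its slices happen to stay valid after later calls; other lexers are free to invalidate them, so generic code should not rely on that.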


/* extract a word surrounded by whitespace */
auto getWord(L)(ref L lexer)
    if (isLexer!L)
{
    // drop leading whitespace
    lexer.dropWhile(" \n\r\t");

    // start building the word
    lexer.start();

    // keep advancing until the trailing whitespace is found
    lexer.advanceUntilAny(" \n\r\t", false);

    // return what was found
    return lexer.get;
}

/* extract a key/value pair from a string like " key : value " */
auto getKeyValuePair(L)(ref L lexer)
    if (isLexer!L)
{
    import std.typecons : tuple;

    // drop leading whitespace
    lexer.dropWhile(" \n\r\t");

    // here starts the key, which ends with either a whitespace or a colon
    lexer.start();
    lexer.advanceUntilAny(" \n\r\t:", false);
    auto key = lexer.get;

    // skip any spaces after the key
    lexer.dropWhile(" \n\r\t");
    // now there must be a colon
    immutable hasColon = lexer.testAndAdvance(':');
    assert(hasColon, "expected ':' after the key");
    // skip all space after the colon
    lexer.dropWhile(" \n\r\t");

    // here starts the value, which ends at the first whitespace
    lexer.start();
    lexer.advanceUntilAny(" \n\r\t", false);
    auto value = lexer.get;

    // return the pair
    return tuple(key, value);
}


Lodovico Giaretta


Copyright Lodovico Giaretta 2016 --


Boost License 1.0.