Lexes the UTF-8 bytes into a list of lexemes.
(lexemize-leo-from-bytes leo-bytes) → leo-lexemes
A lexeme is a token, comment, or whitespace.
If the input cannot be fully lexed, a reserrp is returned.
Function:
(defun lexemize-leo-from-bytes (leo-bytes) (declare (xargs :guard (nat-listp leo-bytes))) (let ((__function__ 'lexemize-leo-from-bytes)) (declare (ignorable __function__)) (b* (((unless (unsigned-byte-listp 8 leo-bytes)) (reserrf (cons :invalid-octets leo-bytes))) (codepoints (acl2::utf8=>ustring leo-bytes)) ((unless (nat-listp codepoints)) (reserrf (cons :invalid-utf-8 leo-bytes)))) (lexemize-leo codepoints))))
Theorem:
(defthm tree-list-resultp-of-lexemize-leo-from-bytes (b* ((leo-lexemes (lexemize-leo-from-bytes leo-bytes))) (abnf::tree-list-resultp leo-lexemes)) :rule-classes :rewrite)