MicroGPT.ts - a conversion of Karpathy's MicroGPT to TypeScript

    Function tokenize

    • Tokenizes a document string into an array of token ids, with the BOS token id prepended and appended.

      Parameters

      • doc: string | undefined

        The document string.

      • uchars: string[]

        Sorted unique characters (from buildTokenizer).

      • BOS: number

        The id of the BOS (beginning-of-sequence) token.

      Returns number[]

      Array of token ids: [BOS, ...charIds, BOS].
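
The documented behavior can be sketched as follows. This is a minimal illustration, not the library's actual source: the name tokenizeSketch is hypothetical, and it assumes that each character's id is its index in uchars, that an undefined doc is treated as an empty string, and that a character missing from uchars is an error.

```typescript
// Hypothetical sketch of the documented contract; the real tokenize may differ.
function tokenizeSketch(
  doc: string | undefined,
  uchars: string[],
  BOS: number
): number[] {
  // Assumption: an undefined document is treated as an empty string.
  const text = doc ?? "";
  const ids: number[] = [BOS];
  for (const ch of text) {
    // Assumption: a character's token id is its position in uchars.
    const id = uchars.indexOf(ch);
    if (id === -1) {
      // Assumption: unknown characters are rejected rather than skipped.
      throw new Error(`Character not in vocabulary: ${JSON.stringify(ch)}`);
    }
    ids.push(id);
  }
  ids.push(BOS);
  return ids;
}

// Usage with a three-character vocabulary; BOS is chosen here as the next free id.
const exampleUchars = ["a", "b", "c"];
const exampleBOS = exampleUchars.length; // 3
console.log(tokenizeSketch("cab", exampleUchars, exampleBOS)); // [ 3, 2, 0, 1, 3 ]
```

Under these assumptions an empty or undefined document yields [BOS, BOS], so every returned array has at least two elements.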