Advanced Type Annotations
When writing JavaScript, it can be hard to keep track of the signatures of functions. TypeScript is supposed to help with this problem, but sometimes TypeScript can get in the way of what you’re trying to do.
It is often convenient not to have to go through a compile step to run your code. Especially in Node.js, avoiding code compilation makes troubleshooting easier because your stack traces match your source code without relying on source maps.
When we wrote the ttj-client library, we knew we wanted to have the best type annotations possible, but we did not want to have to rely on TypeScript.
Enter JSDoc Type Annotations
Luckily, TypeScript supports JSDoc type annotations. We can write regular JavaScript and just add the type definitions in JSDoc comments. Here’s a simple example:
/**
 * @param {number} a
 * @param {number} b
 * @returns {number}
 */
function add(a, b) {
  return a + b;
}
That’s simple enough. Things get trickier with more advanced types such as generics, and it is not obvious how exporting and importing types between files works.
Importing Types from other Files
To import types from other files, you can use the import keyword in JSDoc comments. Here’s an example:
/**
 * @param {import('../db/models/ocrjobs').OcrJob} job
 */
function processJob(job) {
  // do something with the job
}
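When the same imported type shows up in several signatures, repeating the import(...) expression gets noisy. As a minimal sketch, assuming the same '../db/models/ocrjobs' module as above, you can alias the import once with @typedef and reuse the short name. The two helper functions are hypothetical and only there for illustration.

```javascript
/** @typedef {import('../db/models/ocrjobs').OcrJob} OcrJob */

/**
 * Hypothetical helper that builds a one-line summary of a job.
 * @param {OcrJob} job
 * @returns {string}
 */
function describeJob(job) {
  return `${job.id}: ${job.status}`;
}

/**
 * Hypothetical helper: the alias also works inside array types.
 * @param {OcrJob[]} jobs
 * @returns {OcrJob[]}
 */
function failedJobs(jobs) {
  return jobs.filter((job) => job.status === 'failed');
}
```

The typedef comment has no effect at runtime, so the file still runs as plain JavaScript.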
Exporting Types
You don’t need to do anything special to make your types available to other files. Define them with @typedef in a JSDoc comment, and other files can reference them through import(), as in the previous section. Here’s an example:
/**
 * @typedef {{
 *   id: string,
 *   status: 'pending' | 'processing' | 'finished' | 'failed',
 *   retrycount: number,
 *   started?: Date,
 *   finished?: Date,
 *   error?: string,
 *   parsingsteps?: { name: string, maxcount?: number, type: string, skip?: number }[],
 *   schema: any,
 *   result?: any,
 *   timezone?: string,
 *   returnprobabilities?: boolean,
 *   codeData?: { type: string, decoded: string, raw: string, position: number[][] }[]
 * }} OcrJob
 */
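To see the typedef at work, here is a sketch of a consuming file. It assumes the OcrJob typedef lives in the '../db/models/ocrjobs' module used in the import example. The createPendingJob and markFailed functions are hypothetical and not part of any library.

```javascript
/** @typedef {import('../db/models/ocrjobs').OcrJob} OcrJob */

/**
 * Hypothetical factory: builds a job with only the required fields set.
 * @param {string} id
 * @param {any} schema
 * @returns {OcrJob}
 */
function createPendingJob(id, schema) {
  return { id, status: 'pending', retrycount: 0, schema };
}

/**
 * Hypothetical transition: returns a copy of the job marked as failed.
 * @param {OcrJob} job
 * @param {string} reason
 * @returns {OcrJob}
 */
function markFailed(job, reason) {
  return { ...job, status: 'failed', error: reason, finished: new Date() };
}
```

Because status is a union of string literals, a typo such as 'faild' in either function would no longer match the declared return type.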
Advanced Generics
Sometimes, you want to define dynamic data types. For example, in the ttj-client library, in the TTJClient#inferDocumentBySchema function, the return value depends on the returnprobabilities parameter. If the returnprobabilities parameter is set to true, we return an array of the form { value: any, probability: number }[] for every property of the schema. If it is set to false, we return the value directly (which has type any).
Because we support nested objects and arrays, we first need to define a recursive @callback function type for both cases.
First, we deal with the case where returnprobabilities is set to false:
/**
 * @template T
 * @callback TTJParsingFunction
 * @param {T} schema
 * @returns {{ [K in keyof T]?: T[K] extends {} ? ReturnType<TTJParsingFunction<T[K]>> : T[K] }}
 */
This is essentially a recursive Partial<T> type. We first need to define a generic type T using the @template annotation. Then we define a @callback function type TTJParsingFunction that takes a T. The return type is an object with the same keys as T, but every key is optional. If the value of the key is an object, we recursively apply the TTJParsingFunction type to it.
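To look at the "recursive Partial<T>" idea in isolation, here is a minimal sketch that uses a generic @typedef instead of a @callback. The DeepPartial type and the pickKnown helper are hypothetical and not part of ttj-client. The helper also ignores arrays to keep the example short.

```javascript
/**
 * Every key becomes optional, and nested objects are handled recursively.
 * @template T
 * @typedef {{ [K in keyof T]?: T[K] extends object ? DeepPartial<T[K]> : T[K] }} DeepPartial
 */

/**
 * Hypothetical helper: copies only the keys named in `schema` from `raw`.
 * Keys missing from `raw` are left out, so the result is a DeepPartial.
 * @template {object} T
 * @param {T} schema
 * @param {any} raw
 * @returns {DeepPartial<T>}
 */
function pickKnown(schema, raw) {
  /** @type {any} */
  const out = {};
  if (raw === null || typeof raw !== 'object') return out;
  for (const [key, sub] of Object.entries(schema)) {
    if (!(key in raw)) continue;
    // Recurse into nested schema objects, copy leaf values as-is.
    out[key] = sub !== null && typeof sub === 'object'
      ? pickKnown(sub, raw[key])
      : raw[key];
  }
  return out;
}

// Hypothetical schema: every leaf is a string describing the field.
const invoiceSchema = {
  number: 'the invoice number',
  customer: { name: 'customer name', city: 'customer city' },
};

const partialInvoice = pickKnown(invoiceSchema, {
  customer: { name: 'ACME', unrelated: 1 },
  other: 2,
});
```

Here partialInvoice only contains customer.name, and the declared return type allows for that because every key is optional at every level.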
Next up, we deal with the case where returnprobabilities is set to true:
/**
 * @template T
 * @callback ReturnProbabilitiesFunction
 * @param {T} schema
 * @returns {T extends string ? { value: any, probability: number }[] : T extends (infer U)[] ? ReturnType<ReturnProbabilitiesFunction<U>>[] : T extends {} ? { [K in keyof T]?: ReturnType<ReturnProbabilitiesFunction<T[K]>> } : { value: any, probability: number }[]}
 */
Here, we also define a generic type T using the @template annotation. We define a @callback function type ReturnProbabilitiesFunction that takes a T. As we know that every property of our schema is defined as a string, we can check if T is a string by using a ternary expression and the extends keyword and return the { value: any, probability: number }[] type. Otherwise, we check if T is an array. We can declare a new generic variable with the type of the array elements using the infer keyword and use it to call the ReturnProbabilitiesFunction type recursively. If T is an object, similar to the TTJParsingFunction type, we return an object with the same keys as T, but every key is optional.
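The three branches described above (string leaf, array, object) can be sketched with a plain generic @typedef and a small runtime counterpart. Everything here is hypothetical: the Candidates and WithProbabilities names, the placeholderResult function, and the fixed probability of 1 are made up for illustration and are not how ttj-client produces its results.

```javascript
/** @typedef {{ value: any, probability: number }[]} Candidates */

/**
 * Type-level sketch: string leaf, then array (via infer), then object.
 * @template T
 * @typedef {T extends string ? Candidates : T extends (infer U)[] ? WithProbabilities<U>[] : T extends {} ? { [K in keyof T]?: WithProbabilities<T[K]> } : Candidates} WithProbabilities
 */

/**
 * Hypothetical runtime counterpart: walks the schema in the same order
 * as the type and wraps every leaf in a one-element candidate list.
 * @template T
 * @param {T} schema
 * @returns {WithProbabilities<T>}
 */
function placeholderResult(schema) {
  /** @type {any} */
  let out;
  if (typeof schema === 'string') {
    out = [{ value: null, probability: 1 }];
  } else if (Array.isArray(schema)) {
    // Array branch: one result per element, like WithProbabilities<U>[].
    out = schema.map((item) => placeholderResult(item));
  } else if (schema !== null && typeof schema === 'object') {
    out = {};
    for (const [key, sub] of Object.entries(schema)) {
      out[key] = placeholderResult(sub);
    }
  } else {
    out = [{ value: null, probability: 1 }];
  }
  return out;
}
```

The order of the checks matters in both the type and the function: arrays are objects too, so the array branch has to come before the object branch.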
Finally, we define the inferDocumentBySchema function:
/**
 * @template S
 * @template {boolean} R
 * @param {Buffer|Uint8Array|string} data A PDF, PNG, or JPEG file as a buffer, Uint8Array, or data URL
 * @param {'application/pdf'|'image/png'|'image/jpeg'} mimetype The mimetype of the data
 * @param {S} schema
 * @param {(TextParsingStep|ImageParsingStep)[]=} parsingsteps
 * @param {R=} returnprobabilities
 * @returns {Promise<{
 *   results: R extends true ? ReturnType<ReturnProbabilitiesFunction<S>> : ReturnType<TTJParsingFunction<S>>,
 *   ...
 * }>}
 */
async inferDocumentBySchema(data, mimetype, schema, parsingsteps, returnprobabilities) {
  //...
}
Here, we define two generic types S and R. S is the schema type, and R is a boolean that determines if we return probabilities or not. To check if returnprobabilities is set to true, we use the R extends true expression and return the ReturnType<ReturnProbabilitiesFunction<S>> type. Otherwise, we return the ReturnType<TTJParsingFunction<S>> type.
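The R extends true pattern is easier to see on a tiny standalone function. The following is a sketch under stated assumptions: readField is a hypothetical function, not part of ttj-client, and the probability of 1 is a placeholder. The body needs a cast because the checker cannot relate the runtime branch to the conditional return type on its own.

```javascript
/**
 * Hypothetical example of a return type that depends on a boolean flag.
 * @template {boolean} R
 * @param {string} value
 * @param {R=} detailed
 * @returns {R extends true ? { value: string, probability: number }[] : string}
 */
function readField(value, detailed) {
  const result = detailed ? [{ value, probability: 1 }] : value;
  // The cast tells the checker to trust that the branch matches R.
  return /** @type {any} */ (result);
}

const detailedField = readField('42', true); // typed as an array of candidates
const plainField = readField('42', false); // typed as string
```

Passing the literal true or false lets the checker pick one branch. If the flag is omitted or only known as boolean, the result type should be the union of both branches.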
Complex object types
In cases like the parsingsteps parameter, we need to differentiate between two lists of large language models that can handle different types of data. For example, gemini-pro-vision can handle image data while gpt-3.5-turbo cannot. Therefore, if a ParsingStep has type: "raw" or type: "padded", it can’t have name: "vertex/gemini-1.0-pro-vision-001". To solve this, we can introduce a union type:
/**
 * @typedef {'openai/gpt-3.5-turbo'|'openai/gpt-4'|'azure/gpt-35-turbo'|'vertex/text-bison@001'|'ollama/mixtral'|'ollama/llama2'|'ollama/llama2:13b'|'ollama/gemma'} SupportedLanguageModel
 * @typedef {SupportedLanguageModel | 'vertex/gemini-1.0-pro-vision-001'} SupportedVisionModel
 */
/**
 * @typedef {{
 *   type: 'raw' | 'padded',
 *   name: SupportedLanguageModel,
 *   maxcount?: number
 * }} TextParsingStep
 */
/**
 * @typedef {{
 *   type: 'image',
 *   name: SupportedVisionModel,
 *   maxcount?: number
 * }} ImageParsingStep
 */
/**
 * ...
 * @param {(TextParsingStep|ImageParsingStep)[]=} parsingsteps
 * ...
 */
This is called a discriminated union type. If the type property is set to raw or padded, the name property can only be of type SupportedLanguageModel. If the type property is set to image, the name property can only be of type SupportedVisionModel. In this case, SupportedVisionModel is a superset of SupportedLanguageModel, and therefore, we effectively just allow more options for the name property if the type is set to image.
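A discriminated union also helps inside function bodies, because checking the type property narrows the value to one member of the union. Here is a small sketch with shortened model lists. The TextModel, VisionModel, TextStep and ImageStep names and the describeStep function are hypothetical stand-ins for the real typedefs.

```javascript
/**
 * Shortened model lists, for illustration only.
 * @typedef {'openai/gpt-4' | 'ollama/llama2'} TextModel
 * @typedef {TextModel | 'vertex/gemini-1.0-pro-vision-001'} VisionModel
 */

/**
 * @typedef {{ type: 'raw' | 'padded', name: TextModel, maxcount?: number }} TextStep
 * @typedef {{ type: 'image', name: VisionModel, maxcount?: number }} ImageStep
 */

/**
 * Hypothetical helper: checking `type` narrows `step` to one union member.
 * @param {TextStep | ImageStep} step
 * @returns {string}
 */
function describeStep(step) {
  if (step.type === 'image') {
    // Here `step` is an ImageStep, so `name` may be the vision model.
    return `image step using ${step.name}`;
  }
  // Here `step` is a TextStep: `type` is 'raw' or 'padded'.
  return `${step.type} text step using ${step.name}`;
}

/** @type {(TextStep | ImageStep)[]} */
const steps = [
  { type: 'image', name: 'vertex/gemini-1.0-pro-vision-001' },
  { type: 'raw', name: 'openai/gpt-4', maxcount: 2 },
  // { type: 'raw', name: 'vertex/gemini-1.0-pro-vision-001' } would be rejected.
];
```

The commented-out entry is the combination the union is meant to rule out: a text step with the vision-only model.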
Enabling JSDoc Type checking in VSCode
VSCode has built-in support for JSDoc type annotations with type checking and autocompletion.
Add a jsconfig.json file to your project root with the following content:
{
  "compilerOptions": {
    "allowJs": true,
    "checkJs": true
  },
  "exclude": [
    "node_modules"
  ]
}
If your project contains other folders that you don’t want checked, add them to exclude as well, since checking them can slow down your editor.
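If you only want to try this out on a single file, a // @ts-check comment at the top of that file enables the same checking for it. A small sketch follows. The note about the flagged call describes what the checker is expected to report, not captured output.

```javascript
// @ts-check

/**
 * @param {number} a
 * @param {number} b
 * @returns {number}
 */
function add(a, b) {
  return a + b;
}

const sum = add(1, 2);

// With checking enabled, the editor should flag the call below,
// because a string is passed where a number is declared:
// add('1', 2);
```

The comment is ignored at runtime, so the file runs the same with or without it.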
Updates to this post:
- 2024-04-23: Clarify the justification for using JSDoc type annotations instead of TypeScript.