*Like my content?
Sign up to get occasional emails
about new blog posts and other content. *

## Decoding Variable Length Quantity for Source Maps

This is the first article in a series on source maps. We will be building an app to show the mapping between some TypeScript code and the compiled JavaScript using source maps. In order to understand exactly how everything works, instead of using libraries like `source-map`

or `vlq`

, we will write our own decoder and parser from scratch!

You can watch a video version of this post on my Youtube channel.

Here are some useful resources I used for this article:

- Source Map Visualizer by Tobias Koppers of Webpack fame
- Source Map Visualizer by Evan Wallace of Figma fame
- VLQ Source Code by Rich Harris of Rollup fame
- The actual "standard" documentation which is actually just a Google doc - turns out there is no official "spec" as such
- Wikipedia Base64 article
- Wikipedia VLQ article

## What is a Source Map?

Let's say we have this TypeScript code:

```
const greet = (name: string) => {
return `Hello ${name}`
}
```

If you compile it and ask for a source map with `yarn tsc greet.ts --sourceMap`

, you get both the compiler JavaScript (`greet.js`

):

```
var greet = function (name) {
return "Hello " + name;
};
//# sourceMappingURL=greet.js.map
```

...and the source map (`greet.js.map`

):

```
{
"version": 3,
"file": "greet.js",
"sourceRoot": "",
"sources": [
"greet.ts"
],
"names": [],
"mappings": "AAAA,IAAM,KAAK,GAAG,UAAC,IAAY;IACzB,OAAO,WAAS,IAAM,CAAA;AACxB,CAAC,CAAA"
}
```

The main thing we are interested in is `mappings`

:

```
"AAAA,IAAM,KAAK,GAAG,UAAC,IAAY;IACzB,OAAO,WAAS,IAAM,CAAA;AACxB,CAAC,CAAA"
```

This incredibly compact jumble of letters tells us that `var`

in `greet.js`

corresponds to `const`

in `greet.ts`

, as well as how the rest of it maps up... if we can decode it.

## Variable Length Quantity

These letters are variable length quantity - a very concise way of encoding large numbers. To hint at where this is all leading, if you decode `AAAA`

, you get an array of numbers: `[0, 0, 0, 0]`

. `IAAM`

gives us `[4, 0, 0, 6]`

. The next article will go in depth on what each of these numbers means, but basically they map a row and column in the compiled JavaScript to the original TypeScript:

This brings us to the goal of this post: decoding the VLQs to arrays of numbers.

## Segments, Fields and Base 64

The `mappings`

property has many segments, divided up by `,`

. Each one has several fields. `AAAA`

maps to `[0, 0, 0, 0]`

- which has four fields. Each line is separated by a `;`

. Our mappings field has two `;`

- three lines total, which matches up to the compiled JavaScript. The source map always maps from the compiled code to the original code - not the other way around. This means the number of lines represented in the source map will always be the same as the number of lines in the compiled code.

What we are dealing with are base 64 encoded VLQs. According to the standard:

The VLQ is a Base64 value, where the most significant bit (the 6th bit) is used as the continuation bit, and the “digits” are encoded into the string least significant first, and where the least significant bit of the first digit is used as the sign bit.

Decoding `A`

is quite easy, since it is listed in the Base 64 table - it's 0. We can be a bit more thorough in our decoding using the above definition for a VLQ.

A is 0, or in binary, `000000`

. As stated above, the most significant bit (the 6th bit) is used as the continuation bit. In this case the most significant bit (the value on the far left) is 0. This means there is no continuation needed - the number fits into five bits. For larger numbers, this is not the case. We will see an example soon.

It also says the least significant bit of the first digit is used as the sign bit. The least significant bit (the value on the far right) is also 0.

## Encoding Negative Numbers

Let's see an example of a negative number. `J`

is VLQ for -4. Looking at the Base 64 table again, we can see `J`

is 9, or `001001`

in binary. The most significant bit is 0 - so we know the entire number fits within 5 bits. The least significant bit is 1 - that means it's a negative number. We are left with `100`

, which is 4 in decimal. The final decoded value is -4.

## The Continuation Bit

The final example we need to cover is an encoded VLQ that uses a continuation bit. `yB`

decodes to 25. Let's walk through it. Looking at the Base 64 table, we can see `y`

is 50 in decimal, or `110010`

in binary. The most significant bit is a 1 - this means the number requires more than 5 bits to encode. Truncating the leading 1, we are left with `10010`

. `10010`

is 19 in decimal.

```
+-----------------------+
| 19 |
+---+---+---+---+---+---+
| X | 1 | 0 | 0 | 1 | 0 |
+---+---+---+---+---+---+
```

Next is `B`

, which is 1 in decimal or `000001`

in binary. The most significant bit is 0, so we do not need to continue to the next segment.

With this knowledge, `yB`

represented in VLQ looks like this:

```
+-----------------------+-----------------------+
| C | 19 | 1 |
+---+---+---+---+---+---+---+---+---+---+---+---+
| 1 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
+---+---+---+---+---+---+---+---+---+---+---+---+
```

Finally, we need to sum the two numbers. Ignoring the initial continuation bit, we have:

```
+-------------------+-----------------------+
| 19 | 1 |
+---+---+---+---+---+-------+---+---+---+---+
| 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
+---+---+---+---+---+---+---+---+---+---+---+
```

or `10010`

and `000001`

. It's not as simple as 18 + 1 = 19. Referring back to the standard:

The VLQ is a Base64 value, where the most significant bit (the 6th bit) is used as the continuation bit, and

the “digits” are encoded into the string least significant first, and where the least significant bit of the first digit is used as the sign bit.

This means the second value, `000001`

is actually more significant than `10010`

- by five orders of magnitude (in binary), or 31 - `111111`

in binary. This means for each continuation bit we encounter, the subsequent value needs to be increased by 31.

This means the final calculation is 18 (`10010`

) + ( 1 (`000001`

) + 31 (`111111`

) ) = 50.

50? Didn't you say `yB`

decodes to 25? Yes! We are not done yet. The last part of the standard states:

The VLQ is a Base64 value, where the most significant bit (the 6th bit) is used as the continuation bit, and the “digits” are encoded into the string least significant first, and where the

least significant bit of the first digit is used as the sign bit.

This means the final number is not `50`

- which is `110010`

in binary, but `11001`

. The final bit represents the sign - `0`

for positive and `1`

for negative. `11001`

is 25, and it's +25 because the final bit is 0.

```
+--------------------------+
| Value | Sign |
+---+---+---+---+---+------+
| 1 | 1 | 0 | 0 | 1 | 0 |
+---+---+---+---+---+------+
```

The continuation bit is what makes VLQ and source maps complex to understand and decode at first - but with the information above, we are now in a good position to write a `decode`

function!

## Decoding VLQs with JavaScript

Time to write some code. As discussed, VLQ supports representing numbers larger than 31 (`11111`

) using continuation bit(s). This means our solution is going to be recursive, to support arbitrarily large numbers.

Since we are operating with binary representations, we will use JavaScript's built in binary operators heavily in our solution.

Let's start with the same example as used above, `A`

, which decodes to `0`

. Ultimately we want to return an array of numbers, since we are building this decode function for use with source maps, so instead of just decoding `A`

to 0, we will decode `AAAA`

to `[0, 0, 0, 0]`

.

```
function decode(str: string) {
}
console.log(
decode('AAAA')
)
```

The first thing we need to do is separate the first character from the rest. So for `AAAA`

, we want `A`

and `AAA`

.

```
function decode(str: string) {
const [char, ...tail] = str
const rest = tail.join('')
// char => 'A', rest => 'AAA'
}
```

The next thing we need to do is get the Base 64 value for `A`

. The easiest way to get access to a Base 64 -> integer map is simply to hard-code it:

```
const charToInteger: Record<string, number> = {}
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/='
.split('')
.forEach((char, i) => {
charToInteger[char] = i
})
function decode(str: string) {
const [char, ...tail] = str
const rest = tail.join('')
const integer = charToInteger[char]
}
```

Now things get a little more interesting. We need to see if there is a continuation bit. We can do this using a bitwise `&`

. Performing `x & y`

will return a new binary value where bit is `1`

if both corresponding bits in `x`

and `y`

are `1`

. For example:

```
+---+---+---+---+---+---+
| x | 1 | 1 | 0 | 1 | 0 |
+---+---+---+---+---+---+
| y | 0 | 1 | 0 | 1 | 0 |
|=======================|
| = | 0 | 1 | 0 | 1 | 0 |
+---+---+---+---+---+---+
```

This would return `01010`

. A neat trick is just to do `& 32`

to see if we have a continuation bit. Why does this work? 32 is `100000`

. It will return 0 for any value where the sixth bit is not 1. In this example, `A`

is `000000`

& `100000`

returns `000000`

- 0 in decimal - which of course evaluates to false in JavaScript.

```
function decode(str: string) {
const [char, ...tail] = str
const rest = tail.join('')
const integer = charToInteger[char]
const hasContinuationBit = integer & 32
}
```

If there is not continuation bit, we can just check the least significant bit to see if the value is positive or negative, then return the final value.

```
function decode(str: string) {
const [char, ...tail] = str
const rest = tail.join('')
const integer = charToInteger[char]
const hasContinuationBit = integer & 32
if (hasContinuationBit) {
// handle it
} else {
const isNegative = integer & 1
const finalValue = isNegative ? -(integer >>> 1) : integer >>> 1
return finalValue
}
}
```

We use the bitwise `&`

trick again - this time to see if the final bit is 1 or 0. We then need to discard the final bit, since that only represents the sign, it's not part of the actual value, and return the result. We can do that by using a bitwise right shift, which moves everything to the right, truncating the remaining bits. For example `101 >>> 1`

would become `10`

- we've removed the final bit.

Success! `decode`

returns `0`

. We want to iterate over each character, and return an array of `[0, 0, 0, 0]`

. All we need to do is call `decode`

again, passing in the `rest`

. We will also need to keep track of the decoded values:

```
function decode(str: string, decoded = []) {
const [char, ...tail] = str
const rest = tail.join('')
const integer = charToInteger[char]
const hasContinuationBit = integer & 32
if (hasContinuationBit) {
// handle it
} else {
const isNegative = integer & 1
const finalValue = isNegative ? -(integer >>> 1) : integer >>> 1
if (!rest) {
return decoded.concat(finalValue)
}
return decode(rest, decoded.concat(finalValue))
}
}
```

This gives us the desired result.

## Handling Continuation Bits

Now we need to handle decoding values greater than 31, that use a continuation bit. Let's decode `yB`

, which should return `[25]`

. First the code, than an explanation:

```
function decode(str: string, acc = 0, depth = 0, decoded = []) {
const [char, ...tail] = str
const rest = tail.join('')
const integer = charToInteger[char]
const hasContinuationBit = integer & 32
const withoutContBit = integer & 31
const shifted = (withoutContBit << 5 * depth)
const value = acc | shifted
if (hasContinuationBit) {
return decode(rest, value, depth + 1, decoded)
} else {
const isNegative = value & 1
const finalValue = isNegative ? -(value >>> 1) : value >>> 1
if (!rest) {
return decoded.concat(finalValue)
}
return decode(rest, 0, 0, decoded.concat(finalValue))
}
}
```

A bunch of things are going on here. Starting with the updated signature:

```
function decode(str: string, acc = 0, depth = 0, decoded = []) {
```

We need to keep track of current accumulated value (eg, we decode `y`

, adding it on to `0`

, then we decode `B`

, adding it on to the result of `decode(y) + 0`

from the previous iteration.

We also need a `depth`

variable - for each continuation bit we encounter, we need to add 31 on to the decoded value. That's what is happening here:

```
const hasContinuationBit = integer & 32
const withoutContBit = integer & 31
const shifted = (withoutContBit << 5 * depth)
const value = acc | shifted
```

The `& 31`

effectively truncates the continuation bit - for example if we have 40, which is `101000`

in binary, performing `101000 & 11111`

yields `01000`

. It's just a concise way to truncate the continuation bit.

Finally we have:

```
const shifted = (withoutContBit << 5 * depth)
const value = acc | shifted
```

This effectively sums `acc`

(which is the current sum of all previously decoded values in field) and the current value. To really see this in action, work through an example with a larger number such as `63C`

(1405).

Using the Base 64 table, `6`

is `111010`

in binary. It has a continuation bit.

`acc`

:`0`

`withoutContBit`

:`11010`

`shifted`

:`11010 << 5 * 0 = 11010`

`value`

:`0 | 11010 = 11010`

Next is `3`

which maps to `110111`

. Again, lose the continuation bit. `depth`

is now `1`

:

`acc`

:`11010`

`withoutContBit`

:`10111`

`shifted`

:`10111 << 5 * 1 = 1011100000`

`value`

:`11010 | 1011100000 = 1011111010`

Next is `C`

which maps to `000010`

. Last iteration - there is no continuation bit. `depth`

is now `2`

:

`acc`

:`1011111010`

`withoutContBit`

:`00010`

`shifted`

:`00010 << 5 * 2 = 100000000000`

`value`

:`1011111010 | 100000000000 = 101011111010`

Finally, we see if the final bit is 0 for positive or 1 for negative, truncate it and return the value. In this case it's positive. so we return `+10101111101`

, which gives us 1405. A bit messy, but we did it, and learned a thing or two along the way.

## The Final Implementation

The final implementation is show below. It has a lot of temporary variables for clarity. It could be refactored to be much more concise.

```
const charToInteger: Record<string, number> = {}
'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/='
.split('')
.forEach((char, i) => {
charToInteger[char] = i
})
function decode(str: string, acc = 0, depth = 0, decoded = []) {
const [char, ...tail] = str
const rest = tail.join('')
const integer = charToInteger[char]
const hasContinuationBit = integer & 32
const withoutContBit = integer & 31
const shifted = (withoutContBit << 5 * depth)
const value = acc | shifted
if (hasContinuationBit) {
return decode(rest, value, depth + 1, decoded)
}
const isNegative = value & 1
const finalValue = isNegative ? -(value >>> 1) : value >>> 1
if (!rest) {
return decoded.concat(finalValue)
}
return decode(rest, 0, 0, decoded.concat(finalValue))
}
```

Armed with our `decode`

function, the next article will look at constructing a data structure to relate the compiled JavaScript and the original TypeScript, which will power the final part of the project, a source map visualization.

*Like my content?
Sign up to get occasional emails
about new blog posts and other content. *