Flatten HTML

When creating a rich content editor, working on an array is easier than working on a tree like the DOM. This page presents a JavaScript function that converts the DOM into an array of objects. Each object is a leaf of the DOM tree.

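The function takes a node element (the root node) as input and produces an array of objects. Each object holds the leaf's index (its character offset from the start of the root), its length (1 for leaves that are not text nodes, such as <br>), its text, and its parents (the chain of nodes from the root down to the leaf, leaf included).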
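As an illustration, here is a hand-written sketch of what the array should contain for the markup <div>Hello <b>world</b></div>. The plain objects root, bold, text1 and text2 are hypothetical stand-ins for the real DOM nodes, used only so the snippet also runs outside a browser.

```javascript
// Hypothetical stand-ins for the DOM nodes of: <div>Hello <b>world</b></div>
const root  = { nodeName: 'DIV' };
const bold  = { nodeName: 'B' };
const text1 = { nodeName: '#text' };   // "Hello "
const text2 = { nodeName: '#text' };   // "world"

// One entry per leaf, in document order
const flat = [
    { index: 0, length: 6, text: 'Hello ', parents: [root, text1] },
    { index: 6, length: 5, text: 'world',  parents: [root, bold, text2] },
];

// Each index is the sum of the lengths of the leaves before it
console.log(flat.map((leaf) => `${leaf.index}: "${leaf.text}"`));
```

Note that the last entry of parents is the leaf itself, so the formatting tags applied to a leaf are every entry except the last one.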

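To prevent weird behavior with line breaks, I set the CSS property white-space: pre-wrap; on the root node. With this value the browser preserves whitespace and line breaks as they appear in the text nodes, so the lengths in the array match what is displayed.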

Here is the full JavaScript function:

// Convert nested HTML into a flat array of leaves
function flattenHtml (node, flat = [], tagsList = [])
{
    // Add the current node to the list of ancestors
    tagsList.push(node);

    // Check if it is a leaf or not
    if (!node.childNodes.length)
    {
        // The leaf starts where the previous leaf ends (0 for the first leaf)
        const last = flat[flat.length - 1];
        const index = (last === undefined) ? 0 : last.index + last.length;
        // Push the leaf. Text nodes have a length and data, other leaves (e.g. <br>) count as 1.
        // node.data is used instead of node.wholeText, which also includes adjacent text nodes.
        flat.push({ index: index, length: node.length ?? 1, text: node.data ?? '', parents: [...tagsList] });
    }
    else
    {
        // Call the function recursively on each child
        node.childNodes.forEach((child) => { flattenHtml(child, flat, tagsList); });
    }
    // Remove the current node (it is always the last entry at this point)
    tagsList.pop();
    return flat;
}

The result should look like:

[Image: HTML DOM flattened to an array]
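One reason the flat form is convenient in an editor is that a character offset can be mapped back to a leaf with a simple search. The helper below is a minimal sketch of that idea and not part of the original function: findLeafAt is a hypothetical name, and the sample array is written by hand in the same shape as the function's output (with empty parents to keep it short).

```javascript
// Hypothetical helper: return the leaf whose range [index, index + length)
// contains the given character offset, or null if there is none.
function findLeafAt (flat, offset)
{
    return flat.find((leaf) => offset >= leaf.index && offset < leaf.index + leaf.length) ?? null;
}

// Hand-written sample in the shape produced by the flatten function
const flat = [
    { index: 0, length: 6, text: 'Hello ', parents: [] },
    { index: 6, length: 1, text: '',       parents: [] },   // e.g. a <br>: no text, length 1
    { index: 7, length: 5, text: 'world',  parents: [] },
];

// Locate offset 8 and compute the position inside that leaf
const leaf = findLeafAt(flat, 8);
console.log(leaf.text, 8 - leaf.index);
```

Leaves with a length of 0 (empty text nodes) are never returned by this sketch, since no offset falls inside an empty range.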

Last update : 04/12/2020