OpenCV and AHS Heatmap | Working with SVGs (Part 2)

This is part 2 of a series of blog posts about the computer vision behind the AHS Heatmap. Read part one here

When I was experimenting with SVGs and OpenCV, I realized that I would have to convert my floor plans to PNGs in order to detect rooms with computer vision. This CV-based room-detection algorithm was not something I came up with early in the project. It was maybe my third or fourth idea, but it was certainly my best and most thoroughly thought-out one.

Problem 1: Extracting Room Number Text Box Coordinates

To detect the shape of the room around a room number, I would first have to determine where that room number was in the converted PNG. PNGs, unlike SVGs, do not have “selectable” text elements: a PNG is simply a grid of pixels that makes no distinction between lines, shapes, text, or anything else. The SVG floor plan I received, however, stored its room numbers in a rather unapproachable manner, placing them character by character in a seemingly random order throughout the file. This made it very difficult to tell programmatically which character corresponded to which digit of which room number in which location. I therefore needed a separate method of determining which rooms were on a map and where on the map their room numbers were.
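To make the problem concrete, here is a minimal sketch of what reading such a file looks like. The SVG fragment is invented for illustration (the real floor plan's markup is not reproduced in this post), and it assumes that every character sits in its own `<text>` element with plain `x` and `y` attributes.

```python
import xml.etree.ElementTree as ET

SVG_NS = "{http://www.w3.org/2000/svg}"

# Hypothetical fragment: the room numbers "101" and "204" stored one
# character per <text> element, in no particular order.
SAMPLE_SVG = """
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 100">
  <text x="16" y="40">1</text>
  <text x="162" y="40">4</text>
  <text x="156" y="40">0</text>
  <text x="28" y="40">1</text>
  <text x="150" y="40">2</text>
  <text x="22" y="40">0</text>
</svg>
"""


def list_characters(svg_source):
    """Return (character, x, y) for every <text> element, in file order."""
    root = ET.fromstring(svg_source)
    return [
        (element.text, float(element.get("x")), float(element.get("y")))
        for element in root.iter(SVG_NS + "text")
    ]


characters = list_characters(SAMPLE_SVG)
file_order = "".join(char for char, _, _ in characters)
print(file_order)  # 140120
```

Reading the characters in file order gives "140120", so recovering "101" and "204" would require grouping characters by position first. That grouping is the step I wanted to avoid.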

I decided to brainstorm other types of files that distinguish the text contained within them. PDFs came to mind, and luckily Inkscape’s command-line tool can convert SVGs not only to PNGs, but also to PDFs!

I will not go into the details of the PDF text-box-coordinate-extracting algorithm in this post, as there is already a post on my blog dedicated to it. However, I will explain how I used Inkscape to perform the conversions: I used Python’s built-in os module to call os.system(), which executes shell commands. My SVG conversion code is below:

from pathlib import Path
from sys import platform
import os
import time

# Note: these are Inkscape 0.9x flags. Inkscape 1.x replaced
# --export-png and --export-pdf with --export-filename.


def svg_to_png(svg_path, png_path, dpi):
    # Delete the file if it exists, as inkscape won't overwrite
    remove_file(png_path)

    # Run without a GUI, export the whole page, and use a white background
    options = '--without-gui --export-area-page --export-background="#ffffff"'
    command = 'inkscape %s "%s" --export-dpi=%s --export-png="%s"' % (options, svg_path, dpi, png_path)

    os.system(command + silence_suffix())
    wait_for_creation(png_path)


def svg_to_pdf(svg_path, pdf_path):
    # Delete the file if it exists, as inkscape won't overwrite
    remove_file(pdf_path)

    # Run without a GUI and export the whole page
    options = '--without-gui --export-area-page'
    command = 'inkscape %s "%s" --export-pdf="%s"' % (options, svg_path, pdf_path)

    os.system(command + silence_suffix())
    wait_for_creation(pdf_path)


def silence_suffix():
    # On Linux, add ' > /dev/null' to the end of the command to silence
    # inkscape's output by redirecting it to null
    if platform.startswith("linux"):
        return ' > /dev/null'
    return ''


def remove_file(path):
    # Ignore the error raised when there is nothing to delete
    try:
        os.remove(path)
    except OSError:
        pass


def wait_for_creation(path):
    # Poll until the exported file exists, sleeping between checks
    # so the loop does not keep a CPU core busy
    while not Path(path).is_file():
        time.sleep(0.1)
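As an aside, the same Inkscape call can be made without going through the shell. The following is only a sketch of that alternative, not the code the project used: it passes the flags from svg_to_png as an argument list to subprocess.run and adds a timeout to the polling loop. The names build_png_command, wait_for_file, and convert_svg_to_png are made up for this example.

```python
import subprocess
import time
from pathlib import Path


def build_png_command(svg_path, png_path, dpi):
    """Build the same inkscape call as svg_to_png, as an argument list."""
    return [
        "inkscape",
        "--without-gui",
        "--export-area-page",
        "--export-background=#ffffff",
        str(svg_path),
        "--export-dpi=%s" % dpi,
        "--export-png=%s" % png_path,
    ]


def wait_for_file(path, timeout=30.0, poll_interval=0.1):
    """Poll until path exists; return False if timeout seconds pass first."""
    deadline = time.monotonic() + timeout
    while not Path(path).is_file():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


def convert_svg_to_png(svg_path, png_path, dpi):
    """Hypothetical wrapper: run inkscape quietly, then wait for the PNG."""
    subprocess.run(
        build_png_command(svg_path, png_path, dpi),
        stdout=subprocess.DEVNULL,  # portable stand-in for '> /dev/null'
        check=True,  # raise if inkscape exits with an error
    )
    return wait_for_file(png_path)
```

Because the arguments are passed as a list, paths containing spaces or quotes need no manual quoting, and the output is silenced the same way on every platform.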

Something worth noticing is my use of the --export-dpi flag in the SVG-to-PNG conversion command. By specifying the proper DPI, I could produce PNGs whose pixel dimensions matched the dimensions of the PDF. Although not strictly necessary, this made detecting rooms with OpenCV much easier, as I did not have to scale the text box coordinates extracted from the PDF to make them line up across files. I could simply use PDF coordinates as PNG coordinates, and the extracted text box positions were guaranteed to match those in the image OpenCV was analyzing (this was essential to the floodfill algorithm described in part 3).
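The arithmetic behind this can be sketched as follows. PDF page dimensions are measured in points (1/72 of an inch), so a PNG rendered at 72 DPI has one pixel per point, and any other DPI scales the page by dpi / 72. The helper png_size_for_pdf and the US Letter media box are illustrative assumptions, and Inkscape's own rounding of the final pixel size may differ slightly.

```python
POINTS_PER_INCH = 72  # PDF unit: 1 point = 1/72 inch


def png_size_for_pdf(media_box, dpi):
    """Pixel size of a PNG covering the same page as the PDF media box.

    media_box is (x0, y0, x1, y1) in points.
    """
    width_pt = media_box[2] - media_box[0]
    height_pt = media_box[3] - media_box[1]
    scale = dpi / POINTS_PER_INCH
    return (round(width_pt * scale), round(height_pt * scale))


# Hypothetical US Letter page: 612 x 792 points
media_box = (0, 0, 612, 792)
print(png_size_for_pdf(media_box, 72))   # (612, 792): pixels match points
print(png_size_for_pdf(media_box, 144))  # (1224, 1584): coordinates need scaling by 2
```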

Problem 2: SVG to PNG Coordinates

Although both SVGs and PNGs can use pixel coordinate systems, the coordinates of a point on a floor plan’s SVG would be nowhere near the coordinates of the same point on the converted PNG.

[Figure: Vast disparity between PNG and SVG coordinate systems. Here, "on screen" means in relation to the converted PNG.]

This was a serious problem because it meant that any room coordinates detected by OpenCV would be unusable when overlaying the colored rectangles on the filled SVG (see part 3).

Eventually, I discovered two important things:

The y-axes of the PNG and the SVG start at different points: the PNG's starts from the top of the image, while the coordinates I needed when creating SVG elements started from the bottom.
The SVG coordinate that corresponds to a particular PNG coordinate can be calculated by forming a ratio between the SVG’s view box dimensions and the dimensions of the converted page.

By combining these two discoveries, I wrote the following Python code that converts PNG coordinates to SVG coordinates:

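def get_svg_coords(png_coords, svg_view_box, pdf_media_box):
    # Index 2 is the width and index 3 is the height of each box
    # (both boxes are assumed to start at (0, 0))

    # Scale x by the ratio of the SVG width to the PDF/PNG width
    x_coord = png_coords[0] * svg_view_box[2] / pdf_media_box[2]
    # Scale y the same way, then flip it because the y axes run in opposite directions
    y_coord = svg_view_box[3] - png_coords[1] * svg_view_box[3] / pdf_media_box[3]
    return (x_coord, y_coord)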
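To check the conversion, it helps to run it on a few corner points. The sketch below repeats the function so that it runs on its own and adds an inverse, get_png_coords, which is not part of the original code. The view box and media box values are made up for illustration.

```python
def get_svg_coords(png_coords, svg_view_box, pdf_media_box):
    """PNG pixel coordinates to SVG coordinates (same arithmetic as above)."""
    x_coord = png_coords[0] * svg_view_box[2] / pdf_media_box[2]
    y_coord = svg_view_box[3] - png_coords[1] * svg_view_box[3] / pdf_media_box[3]
    return (x_coord, y_coord)


def get_png_coords(svg_coords, svg_view_box, pdf_media_box):
    """Hypothetical inverse: SVG coordinates back to PNG pixel coordinates."""
    x_coord = svg_coords[0] * pdf_media_box[2] / svg_view_box[2]
    y_coord = (svg_view_box[3] - svg_coords[1]) * pdf_media_box[3] / svg_view_box[3]
    return (x_coord, y_coord)


# Made-up dimensions: a 1000 x 500 view box and an 800 x 400 page
view_box = (0, 0, 1000, 500)
media_box = (0, 0, 800, 400)

top_left = get_svg_coords((0, 0), view_box, media_box)
centre = get_svg_coords((400, 200), view_box, media_box)
bottom_right = get_svg_coords((800, 400), view_box, media_box)

print(top_left)      # (0.0, 500.0): the top of the PNG is the largest SVG y
print(centre)        # (500.0, 250.0)
print(bottom_right)  # (1000.0, 0.0)

# Converting back should return the original pixel
print(get_png_coords(centre, view_box, media_box))  # (400.0, 200.0)
```

The top-left pixel maps to the largest y value and the bottom-right pixel to y = 0, which is the axis flip described above.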