On 14 November 2017, Microsoft’s SeeingAI was released in the UK as well. Acknowledging its outstanding performance in many respects — despite Microsoft’s modest claims about its abilities — such as speed, OCR accuracy on text, face recognition, and even indoor scene recognition, we were of course curious how it performs on STEM graphics. In this NanoTip, we show two of the images we chose to test SeeingAI with and the results it gave, along with the IRIS-like description in the caption of each image as a baseline.

Let’s see. The first image on the menu is the iconic quadratic function we use for most tests. Here is what it looks like.

The figure has the title “Graph of a quadratic function with a vertical shift”. The horizontal axis shows a displacement in units of metres, while the vertical axis shows a force in units of newtons. The scale on the horizontal axis has limits of -15 and +15, and the plotted values range from -10 to +10 in steps of 5. The limits on the vertical axis are -10 and +110, and the values are plotted between -5 and 95 in steps of 5. The figure shows a single continuous function, an upward-opening parabola, plotted with a red dashed line. The quadratic function is shifted downward by 5 units and has no horizontal shift. It intersects the y-axis at [0, -5] and crosses the x-axis twice, approximately between -2 and -2.5 and between +2 and +2.5. The minimum of the parabola is at its intersection with the vertical axis.

Now this is what SeeingAI returns when recognising the image saved as a photo on an iPhone 6 running iOS 10.3.

“Text: Graph of a quadratic function with vertical shift
100, 80, 40, 20, —15, —10
Displacement (m)
10, 15

Scene: a close up of a mans face”.

As you can see, the app provides separate recognition of the text it finds and of the possible scene. This is brilliant, and so is the OCR accuracy, even on a graph. However, in all fairness, we can’t conclude that this is a particularly useful transcription of the data.

Now let’s see what happens with our second example, the water molecule, which looks as follows.

The figure has the title “Water molecule” and is set on a white background. At the bottom of the image, the chemical formula H2O can be read. In the centre, a red sphere is visible, with two additional, half-size grey spheres placed at 8 and 4 o’clock relative to the central, bigger sphere. The two grey objects resemble spheres distorted where they join the red sphere, and the connections are shown as straight edges. The central sphere carries the label capital O printed in black, and the smaller spheres are labelled with capital H.

Here SeeingAI provided the following result.

“Text: WATER MOLECULE
H20
Scene: probably a close up of a ball”.

In this case, the guess is not so far off, though a bit more detail wouldn’t harm a student in the process of learning about the molecular structure of water.

Conclusion? SeeingAI seems to be magic. Well done, Microsoft: the document recognition is really well thought through, and kudos for the speed, for the usability engineering put into the app, and even for the scene recognition, which in our experience is pretty chatty in indoor environments, with surprisingly accurate descriptions going into details we had not expected from AI applications before. HOWEVER, when it comes to graphs or STEM illustrations, the contrast between IRIS’s Natural Intelligence and SeeingAI’s Artificial Intelligence is stark. Having said that, this is not to say that NI and AI could not become complementary to each other in the future.

NanoTip: SeeingAI performing on STEM content