Machine Learning: Oriented Text Detection from Natural Scene
This article has been contributed by Lin Ma, Software Engineer and KVM Virtualization Specialist at SUSE.
Oriented text detection in natural scenes has attracted considerable attention from several research communities. It is also useful in practice: text has to be located in an image before it can be recognized.
Today I’d like to present a demo based on the Connectionist Text Proposal Network (CTPN), implemented with the TensorFlow framework.
Preparation
The project code I used can be found on GitHub at:
I also made some changes to the code.
VGG16, also called OxfordNet, is a convolutional neural network architecture named after the Visual Geometry Group at Oxford, which developed it. My goal was to train the VGG16 network model in two different KVM guests:
- The first one is a SUSE Linux Enterprise Server 15 KVM guest with a passed-through NVIDIA GeForce GTX 970 and a USB camera. In it, I built the upstream TensorFlow 1.12 framework with GPU support, CUDA 10 and cuDNN 7.3.
- The second one is an openSUSE Tumbleweed KVM guest. I installed TensorFlow 1.10 from the home:mslacken:ml project in our Open Build Service (OBS). Be careful though: this package has no GPU support yet.
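To confirm which of the two guests actually offers GPU support to TensorFlow, a small check along the following lines could be run in each of them. This is only a sketch: the helper name `tensorflow_gpu_report` is hypothetical, and it assumes a TensorFlow 1.x installation that provides `tf.test.is_built_with_cuda()` and `tf.test.is_gpu_available()`.

```python
def tensorflow_gpu_report():
    """Return a small dict describing GPU support, or None without TensorFlow.

    Hypothetical helper for illustration; it is not part of the CTPN project.
    """
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return None

    return {
        "version": tf.__version__,
        # True if this TensorFlow build was compiled against CUDA.
        "built_with_cuda": bool(tf.test.is_built_with_cuda()),
        # True if a GPU device is visible at runtime (driver, passthrough).
        "gpu_available": bool(tf.test.is_gpu_available()),
    }


if __name__ == "__main__":
    report = tensorflow_gpu_report()
    if report is None:
        print("TensorFlow is not installed")
    else:
        for key, value in report.items():
            print(f"{key}: {value}")
```

In a guest set up like the first one, both flags would be expected to be true. A package built without GPU support, like the one in the second guest, should report `built_with_cuda` as false.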
Example 1
In the SUSE Linux Enterprise Server 15 KVM guest, I trained a VGG16 network model following the project’s instructions. I didn’t modify any hyperparameters, so training ran for the default 50,000 iterations. Training took 5.5 hours with the help of the GPU.
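To put these numbers in perspective, the average time per iteration can be derived from the two figures above. The following sketch does only that arithmetic. The function name is made up for illustration, and the result is an average over the whole run, not a measured per-step time.

```python
def seconds_per_iteration(hours, iterations):
    """Average wall-clock seconds spent on one training iteration."""
    return hours * 3600.0 / iterations


# Figures from the training run described above.
sec_per_iter = seconds_per_iteration(5.5, 50000)
print(f"{sec_per_iter:.3f} s/iteration, {1 / sec_per_iter:.2f} iterations/s")
# prints "0.396 s/iteration, 2.53 iterations/s"
```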
To demonstrate the results more visibly, I created a short video – just have a look.
Example 2
In my openSUSE Tumbleweed KVM guest, I could not train the VGG16 network model, because our TensorFlow package in OBS removes support for the Amazon Web Services, Google Cloud and Apache Kafka packages.
However, the code of that network model needs the cloud modules during training. Thus I used the pre-trained ctpn.pb model for the test.
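A missing-module problem like the one described above can be detected before starting a long training run. The sketch below uses only the Python standard library to report which modules cannot be found. The TensorFlow module names in the usage part are assumptions about which `tf.contrib` modules correspond to the removed packages; they are not taken from the project or from the OBS package.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of `names` that cannot be located for import."""
    missing = []
    for name in names:
        try:
            # For dotted names, find_spec imports the parent packages first.
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # A parent package is missing or is not a package.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumed module names, for illustration only.
    candidates = [
        "tensorflow.contrib.cloud",    # Google Cloud support (assumption)
        "tensorflow.contrib.kafka",    # Apache Kafka support (assumption)
        "tensorflow.contrib.kinesis",  # Amazon Web Services support (assumption)
    ]
    for name in missing_modules(candidates):
        print(f"not available: {name}")
```

If TensorFlow itself is not installed, all three candidates are reported as unavailable.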
Without GPU support, text detection on video was extremely slow in this guest, so I could not create a video like the one in example 1. Instead, I took some pictures for the test. I hope the results are nevertheless easy to recognize in the pictures.
Summary
The two examples above demonstrate that machine learning tasks such as oriented text detection can be performed with SUSE Linux Enterprise Server 15 and openSUSE Tumbleweed.