

Next step in image preprocessing for OCR with Tesseract (tess4j)

Question:

Tag: java,image-processing,ocr,tesseract,tess4j

I've been trying to use Tesseract to identify some digits in a series of images and after scouring for advice I've made a number of improvements. So far I've attempted the following steps:

The numbers are all identical in shape and perfectly aligned, though their edges are somewhat jagged. Example processed images:

[Two example images of the processed digits were attached here.]

Tesseract does okay with these, but it often confuses 8 for 3, 6 for 5, and 9 for 5.

I've been looking a little at different ways to smooth the image and trying different scales, but I'm also wondering if it makes more sense to just go through the process of training Tesseract. With only 10 possible values that are always almost identical, it seems like it shouldn't be too hard for it to learn to recognize them, but training Tesseract also seems like a huge pain.

Any suggestions on how to get the final bit of accuracy out of Tesseract on these images?

I'm using tess4j and Java, so Java-specific suggestions and libraries are especially appreciated. While I'm willing to implement algorithms myself, I'd hate to reinvent the wheel.


Answer:

I tried a few more preprocessing ideas without making much progress, including various types of greyscale conversion, color inversion, resizing, and alternate binarization strategies. None of these improved on my original, non-resized binarization. Ultimately I decided to give Tesseract training a go. I followed the instructions here: Manual Tesseract Training Walkthrough
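For anyone wondering what "binarization" means in code, here is a minimal sketch of a fixed-threshold version using only java.awt. The class name Binarizer, the luma weights, and the threshold parameter are illustrative assumptions; this is not necessarily the exact binarization I used.

```java
import java.awt.image.BufferedImage;

final class Binarizer {
    // Returns a black-and-white copy of src: pixels at or above the
    // brightness threshold become white, all others become black.
    static BufferedImage binarize(BufferedImage src, int threshold) {
        BufferedImage out = new BufferedImage(
                src.getWidth(), src.getHeight(), BufferedImage.TYPE_BYTE_BINARY);
        for (int y = 0; y < src.getHeight(); y++) {
            for (int x = 0; x < src.getWidth(); x++) {
                int rgb = src.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF;
                int g = (rgb >> 8) & 0xFF;
                int b = rgb & 0xFF;
                // Integer approximation of perceived brightness (0..255).
                int luma = (299 * r + 587 * g + 114 * b) / 1000;
                out.setRGB(x, y, luma >= threshold ? 0xFFFFFFFF : 0xFF000000);
            }
        }
        return out;
    }
}
```

A threshold around 128 is a reasonable starting point for dark digits on a light background; the right value depends on the images.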

I had a hard time finding any tools that actually worked on 64-bit Windows and ended up doing most of the work by hand. I used jTessBoxEditor to edit the manually generated .box files, though I also did some editing in a text editor to add entries for characters the box file generator missed. I only have these small TIFFs to work from, so my training files don't meet the Tesseract wiki guidelines, but oh well.
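Since I was adding .box entries by hand, a small sanity check on each line would have been handy. The sketch below assumes the Tesseract 3.x box layout of six whitespace-separated fields (symbol, left, bottom, right, top, page) with the origin at the bottom-left of the image. The class name BoxLine and the sample coordinates are made up for illustration, and this only catches malformed entries; it is not a fix for any particular training error.

```java
final class BoxLine {
    final String symbol;
    final int left, bottom, right, top, page;

    BoxLine(String symbol, int left, int bottom, int right, int top, int page) {
        this.symbol = symbol;
        this.left = left;
        this.bottom = bottom;
        this.right = right;
        this.top = top;
        this.page = page;
    }

    // Parses one box-file line and rejects entries with the wrong field
    // count or with an empty/inverted rectangle.
    static BoxLine parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 6) {
            throw new IllegalArgumentException("expected 6 fields: " + line);
        }
        BoxLine b = new BoxLine(f[0],
                Integer.parseInt(f[1]), Integer.parseInt(f[2]),
                Integer.parseInt(f[3]), Integer.parseInt(f[4]),
                Integer.parseInt(f[5]));
        if (b.left >= b.right || b.bottom >= b.top) {
            throw new IllegalArgumentException("empty or inverted box: " + line);
        }
        return b;
    }

    // Writes the entry back in the same one-line format.
    String format() {
        return symbol + " " + left + " " + bottom + " " + right + " " + top + " " + page;
    }
}
```

For example, BoxLine.parse("8 10 5 30 40 0") describes the digit 8 in a box from (10, 5) to (30, 40) on page 0.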

I got some errors when using box.train:

FAIL! apply_boxes BOXFILE LINE ... failure! COULDN'T FIND A MATCHING BLOB

After unproductive googling I decided to ignore them and press on.

I got more errors when trying to run cntraining:

Error: Illegal number of feature sets!
signal_termination_handler:Error:Signal_termination_handler called:Code 3001

After MORE unproductive googling, I basically tried omitting each of my .tr files in turn to see which one caused the problem. Eventually I was able to complete cntraining with 1 missing file. I have no idea what effect this has on my output, but again I decided to just ignore it and keep going.
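The "omit each file in turn" approach can be automated rather than done by hand. This is a rough sketch, not what I actually ran: LeaveOneOut is a hypothetical helper, and the stepSucceeds predicate stands in for however you run the failing step on a subset of files and decide whether it worked.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class LeaveOneOut {
    // Returns every file whose removal, on its own, makes the step succeed.
    static List<String> culprits(List<String> files, Predicate<List<String>> stepSucceeds) {
        List<String> result = new ArrayList<>();
        for (int i = 0; i < files.size(); i++) {
            List<String> subset = new ArrayList<>(files);
            subset.remove(i); // drop the i-th file, keep all the others
            if (stepSucceeds.test(subset)) {
                result.add(files.get(i));
            }
        }
        return result;
    }
}
```

In practice the predicate could launch cntraining on the subset with ProcessBuilder and inspect its output or exit status, but I haven't verified how reliably that tool signals failure, so treat that part as an assumption.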

I ran into another problem running combine_tessdata:

Error opening unicharset file
Error combining tessdata files into foo.traineddata

This was because I needed to put my language prefix in front of the unicharset file name (foo.unicharset for a language called foo), which the tutorial didn't instruct me to do. After doing that, I successfully built a traineddata file. With no idea whether it would work, I dropped it into my tessdata directory, switched my language to the newly trained one, and tried again.
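The renaming step is easy to script. Below is a minimal sketch that adds the language prefix to whichever training outputs are present in a directory. Only unicharset comes from my own experience above; the other names in TRAINING_OUTPUTS are my assumption about what Tesseract 3.x training tools typically produce, and LangPrefixer is a made-up helper name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class LangPrefixer {
    // Assumed list of un-prefixed training outputs; adjust to what you actually have.
    static final String[] TRAINING_OUTPUTS = {
        "unicharset", "inttemp", "normproto", "pffmtable", "shapetable"
    };

    // Renames e.g. "unicharset" to "foo.unicharset" inside dir.
    // Returns how many files were renamed. Files.move throws if the
    // prefixed target already exists, so nothing is silently overwritten.
    static int addPrefix(Path dir, String lang) throws IOException {
        int renamed = 0;
        for (String name : TRAINING_OUTPUTS) {
            Path src = dir.resolve(name);
            if (Files.exists(src)) {
                Files.move(src, dir.resolve(lang + "." + name));
                renamed++;
            }
        }
        return renamed;
    }
}
```

Calling LangPrefixer.addPrefix(trainingDir, "foo") before running combine_tessdata should leave the files named the way that tool expects for a language called foo.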

VOILA, it was perfect. It seems to now recognize my digits with 100% accuracy (at least across my limited sample size). The only preprocessing I'm doing is binarizing the images and no further cleanup or rescaling.

So, apparently with a small charset, manual training is worth the trouble. It took me probably 3 hours to muddle through finding tools that work and kludging my way through the process. For reference, I started with 14 TIFFs similar to those in my initial post. Four of them had one error or another along the way, plus the one I omitted from cntraining (but not from anything else?), so roughly 9 and 2/3 images for training. That was apparently enough, thanks to the consistency of my characters.

