Announcement for Downloading Full Text File
Please respect the Copyright Act.
All digital full-text dissertations and theses on this website are authorized by their copyright owners. These copyrighted full-text dissertations and theses may be used only for academic, research, and non-commercial purposes. Users of this website may search, read, and print them for personal use. In accordance with the Copyright Act of the Republic of China, please do not reproduce, distribute, change, or edit the content of these dissertations and theses without permission. Please do not create any work based upon a pre-existing work by reproduction, adaptation, distribution, or other means.
URN: etd-0707106-102724
Statistics: This thesis has been viewed 2621 times and downloaded 1812 times.
Author: Kuen-Jan Shie
Author's Email Address: email@example.com
Department: Communication Engineering
Year: 2005
Semester: 2
Degree: Master
Type of Document: Master's Thesis
Language: English
Page Count: 67
Title: Mathematical Equations Extraction in the Document Images
Keyword: OCR, Mathematical Equation, Document Analysis
Abstract
This paper presents a method for automatically extracting mathematical equations from selected document images. The method uses the proposed document analysis procedures to separate the mathematical equations from the text in the images without performing character recognition.
Efficiently extracting mathematical equations from scientific documents is a key step toward an OCR system that recognizes those equations accurately. The proposed method locates the exact areas occupied by mathematical equations and extracts them. This location information is fundamental to recognizing the equations. Removing the equations also lets a commercial OCR system process only the ordinary text, which improves its recognition rate on documents that contain mathematical equations.
The experiments use scientific document images selected from the IEEE and ACM digital libraries to examine the proposed method. The results show that the method separates the mathematical equations from a given document with an accuracy of more than 94%. Accuracy evaluations and comparisons of results are also provided and discussed.
Advisor Committee
Cheng-Jen Tang - advisor
Hung-Ta Pai - co-chair
Shuenn-Shyang Wang - co-chair
Date of Defense: 2006-06-15
Date of Submission: 2006-07-07