In video indexing and summarization, videotext is a very compact and accurate source of information. Most videotext detection and extraction methods deal only with static videotext on video frames; few methods can handle motion videotext efficiently, since motion videotext is hard to extract well. In this paper, we propose a two-directional videotext extractor, called 2DVTE, developed as an integrated system to detect, localize, and extract scrolling videotext. First, the detection method uses edge information to classify regions into text and non-text regions. Second, for the localization of scrolling videotext, we propose a two-dimensional projection profile method based on horizontal and vertical edge map information. Considering the characteristics of Chinese text, the vertical edge map is used to localize possible text regions and the horizontal edge map is used to refine them. Third, the extraction method consists of dual-mode adaptive thresholding and a multi-seed filling algorithm. The dual-mode adaptive thresholding produces a non-rectangular pattern that separates background and foreground more precisely. The multi-seed filling algorithm considers both the minimum and maximum length and four directions of a stroke, whereas the previous method considers only the minimum length and two directions. With this multi-seed exploitation of strokes, precise seeds are obtained to produce more refined videotext. Considering high throughput and low complexity, we achieve a real-time system for detecting, localizing, and extracting scrolling videotext using only one frame instead of the multi-frame integration used in other work. According to experimental results on various video sequences, both horizontal and vertical scrolling videotext can be extracted precisely. We also compare our approach with other methods; in our analysis, its performance is superior to existing methods in both speed and quality.

Nowadays, a large amount of video content is obtained from TV broadcasting, the internet, and wireless networks. To enable users to quickly locate the content they are interested in within this enormous quantity of video data, many researchers have devoted themselves to video indexing and summarization. Textual, visual, and audio information are most frequently used for users to locate their desired video content. Of all the information obtained from video, text in video is the most reliable for this purpose. In particular, superimposed text is intended to carry and stress important information in video, since it is typically generated by video title machines or graphical font generators in studios. When video content is obtained from TV or digital broadcasting, this superimposed text has already been embedded in the frames during the encoding process. If these text occurrences could be detected, segmented, and recognized automatically, they would represent a valuable source of high-level semantics for indexing and retrieval. The detected text can serve as text descriptors in MPEG-7 for other applications, and can also be used in the MPEG-7 Description Scheme for annotating media content. Indexing and retrieval of news programs and sports broadcasts benefit from such a keyword-based indexing system, as shown in the figure. Thus, the critical technique for this purpose is to detect and extract text accurately. However, the detection and extraction of text is difficult due to the variety of text appearances and complex backgrounds. To solve these problems, an efficient and robust text detection, localization, and extraction method is necessary.

Videotext recognition basically consists of four steps: detection, localization, extraction, and recognition. The detection step classifies regions into text and non-text regions. The localization step determines the accurate boundaries of the text strings. The extraction step eliminates the background pixels, and the remaining text pixels are passed to text recognition. The recognition step is accomplished by OCR software. A summarization of the video content can be generated from these steps. In this paper, we discuss the first three steps, because recognition is relatively mature work and its further application is beyond the scope of this paper. Hereby, we propose an integrated solution for a two-directional videotext extractor, called 2DVTE. Because some languages can be written in both vertical and horizontal directions, our method can extract two-directional videotext, unlike traditional methods which focus only on one-directional videotext extraction.
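As a rough illustration of the edge-based detection step, the sketch below classifies fixed-size blocks as text or non-text by their edge density. The block size and both thresholds are assumptions for illustration, not values from the paper.

```python
import numpy as np

def detect_text_blocks(gray, block=16, edge_thresh=40, density_thresh=0.1):
    """Classify each block of a grayscale frame as text/non-text by
    edge density. Parameter values are illustrative assumptions."""
    g = gray.astype(float)
    # Simple gradient-magnitude edge map (text regions are edge-dense).
    gx = np.abs(np.gradient(g, axis=1))
    gy = np.abs(np.gradient(g, axis=0))
    edges = (gx + gy) > edge_thresh

    h, w = gray.shape
    labels = {}
    for y in range(0, h, block):
        for x in range(0, w, block):
            # Fraction of edge pixels in this block decides the label.
            density = edges[y:y + block, x:x + block].mean()
            labels[(y, x)] = density > density_thresh
    return labels
```

A block covering high-contrast strokes yields a high edge density and is labeled as text, while flat background blocks fall below the threshold.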
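The two-dimensional projection profile localization can be sketched as follows: an edge map is projected onto both axes, and contiguous high-density runs in the row profile mark horizontal text bands while runs in the column profile mark vertical ones. This is a minimal sketch under assumed thresholds; the paper's exact profile computation and refinement by the horizontal edge map are not reproduced here.

```python
import numpy as np

def localize_text(gray, edge_thresh=60, band_thresh=0.15):
    """Sketch of two-directional projection-profile localization.
    Threshold values are illustrative assumptions."""
    g = gray.astype(float)
    # Vertical edge map: strong horizontal gradients, which respond
    # well to the dense strokes of (Chinese) text.
    v_edges = np.abs(np.gradient(g, axis=1)) > edge_thresh

    # Project the edge map onto each axis: rows with many edge pixels
    # suggest horizontal text bands, columns suggest vertical ones.
    row_profile = v_edges.mean(axis=1)
    col_profile = v_edges.mean(axis=0)

    def runs(profile, thresh):
        """Contiguous (start, end) index ranges above the threshold."""
        mask = (profile > thresh).astype(int)
        idx = np.flatnonzero(np.diff(np.r_[0, mask, 0]))
        return list(zip(idx[::2], idx[1::2]))

    h_bands = runs(row_profile, band_thresh)  # horizontal text bands
    v_bands = runs(col_profile, band_thresh)  # vertical text bands
    # The horizontal edge map would then refine each band's boundaries.
    return h_bands, v_bands
```

Projecting in both directions is what allows the same pass to pick up horizontally and vertically laid-out text.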
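For the extraction stage, a simplified sketch of the two ideas described above: a block-local (adaptive) threshold whose foreground/background split follows a non-rectangular pattern, and a stroke filter that measures run lengths in four directions (left/right and up/down) and keeps only pixels whose stroke width lies between a minimum and maximum. The function names, block size, and length bounds are assumptions; the paper's dual-mode thresholding and seed-filling details are not reproduced.

```python
import numpy as np

def run_length_map(mask):
    """Per-row length of the contiguous True run each pixel belongs to."""
    h, w = mask.shape
    out = np.zeros((h, w), dtype=int)
    for r in range(h):
        c = 0
        while c < w:
            if mask[r, c]:
                start = c
                while c < w and mask[r, c]:
                    c += 1
                out[r, start:c] = c - start
            else:
                c += 1
    return out

def extract_text_pixels(gray, block=16, min_len=2, max_len=12):
    """Sketch: adaptive thresholding plus a four-direction stroke-length
    filter. Parameter values are illustrative assumptions."""
    h, w = gray.shape
    binary = np.zeros((h, w), dtype=bool)
    # Adaptive thresholding: compare each block against its own mean,
    # so the foreground/background split can follow a non-rectangular
    # pattern instead of one global cut.
    for y in range(0, h, block):
        for x in range(0, w, block):
            tile = gray[y:y + block, x:x + block]
            binary[y:y + block, x:x + block] = tile > tile.mean() + 10

    # Stroke filter: measure the run through each pixel left/right and
    # up/down; the stroke width (the smaller run) must fall within
    # [min_len, max_len], rejecting both specks and large blobs.
    h_runs = run_length_map(binary)
    v_runs = run_length_map(binary.T).T
    width = np.minimum(h_runs, v_runs)
    return binary & (width >= min_len) & (width <= max_len)
```

Bounding the stroke width from both sides is what lets the filter discard large bright regions that a minimum-length-only test would keep.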