File:A Supervised Learning Approach For Heading Detection.pdf

Original file(1,275 × 1,650 pixels, file size: 7.13 MB, MIME type: application/pdf, 19 pages)

Summary

Description
English: As the Portable Document Format (PDF) increases in popularity, research into analysing its structure for text extraction and analysis is necessary. Detecting headings can be a crucial component of classifying and extracting meaningful data. This research involves training a supervised learning model to detect headings, with features carefully selected through recursive feature elimination. The best-performing classifier had an accuracy of 96.95%, a sensitivity of 0.986, and a specificity of 0.953. This research into heading detection contributes to the field of PDF-based text extraction and can be applied to the automation of large-scale PDF text analysis in a variety of professional and policy-based contexts.
Date
Source Content available at arXiv.org (Dedicated link) (archive.org link)
Author Sahib Singh Budhiraja, Vijay Mago

Licensing

This file is made available under the Creative Commons CC0 1.0 Universal Public Domain Dedication.
The person who associated a work with this deed has dedicated the work to the public domain by waiving all of their rights to the work worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law. You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission.

File history

Date/Time 17:45, 8 November 2018 (current)
Dimensions 1,275 × 1,650, 19 pages (7.13 MB)
User Acagastya
Comment User created page with UploadWizard
