Boilerplate Detection Using Shallow Text Features
Published on Oct 07, 2010
In addition to the actual content, Web pages consist of navigational elements, templates, and advertisements. This boilerplate text is typically not related to the main content, may deteriorate search