Menu

Shallow Semantic Parsing of Product Offering Titles (for better auto-hyperlink insertion)

calendar icon Oct 7, 2014 2241 views
split view icon
video icon
presentation icon
video with chapters icon
video thumbnail
Pause
Mute
speed icon
speed icon
0.25
0.5
0.75
1
1.25
1.5
1.75
2

With billions of database-generated pages on the Web where consumers can readily add priced product offerings to their virtual shopping cart, several opportunities will become possible once we can automatically recognize what exactly is being offered for sale on each page. We present a case study of a deployed data-driven system that first chunks individual titles into semantically classified sub-segments, and then uses this information to improve a hyperlink insertion service. To accomplish this process, we propose an annotation structure that is general enough to apply to offering titles from most e-commerce industries while also being specific enough to identify useful semantics about each offer. To automate the parsing task we apply the best-practices approach of training a supervised conditional random fields model and discover that creating separate prediction models for some of the industries along with the use of model-ensembles achieves the best performance to date. We further report on a real-world application of the trained parser to the task of growing a lexical dictionary of product-related terms which critically provides background knowledge to an affiliate-marketing hyperlink insertion service. On a regular basis we apply the parser to offering titles to produce a large set of labeled terms. From these candidates we select the most confidently predicted novel terms for review by crowd-sourced annotators. The agreed on terms are then added into a dictionary which significantly improves the performance of the link-insertion service. Finally, to continually improve system performance, we retrain the model in an online fashion by performing additional annotations on titles with incorrect predictions on each batch.

RELATED CATEGORIES

MORE VIDEOS FROM THE EVENT

MORE VIDEOS FROM THE SAME CATEGORIES

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International license.