August 2 @ 12:00 pm - 1:00 pm
DataBytes is an NCDS-sponsored lunchtime webinar series that gives our members and the larger data community the chance to discuss the most pressing and interesting issues in data science. Webinars are generally held on the first Wednesday of every month, excluding July. All webinars are easily accessible from the NCDS homepage and from this page.
For the August talk, Marcello Balduccini, assistant research professor at Drexel University, will present.
Abstract: Information Retrieval (IR) is arguably a staple of everyday life and is at the core of a number of commercial activities. Every day, people consult Wikipedia, search private databases for patient information, and search public databases for scientific publications or news about partners and competitors. It is not a stretch to say that IR is a fundamental driver of several important areas, including healthcare, scientific discovery, and competitive advantage for industry.
Although great progress has been made in IR over the years, state-of-the-art techniques still lag behind user needs. IR techniques are still largely based on identifying syntactic matches between words in the available sources and words in a user’s query. Successfully matching documents and queries related to events and the corresponding states of the world, however, requires more sophisticated matching strategies that go beyond syntactic structure.
In this talk, Balduccini will give an overview of his recent research aimed at bridging this gap by developing a new, more powerful kind of IR, which he calls action-centered IR. Action-centered IR aims to subsume and generalize traditional IR techniques by introducing semantic-level matching techniques that make it possible to accurately match queries and documents related to events and to the state of the world before, during, and after them.