Seminar: Probabilistic Methods for Assessing Information Quality
Description: Assessing the correctness of claims coming from different sources is of tremendous importance for many information integration and reconciliation tasks. Such tasks encompass the aggregation and sharing of news, social editing, the description of e-commerce products, the consolidation of database entries (e.g., after the merger of companies), etc. In the absence of a ground truth, one of the main challenges is the reconciliation of contradictory claims from sources that may deliver noisy, outdated, erroneous, or incomplete information. For example, different flight-booking websites may report different arrival times for the same flight. The task of estimating the correctness of claims by aggregating the information from the available sources is commonly referred to as latent truth discovery (LTD). This seminar focuses on state-of-the-art approaches to the LTD problem, which go well beyond majority voting schemes and instead jointly infer source reliabilities and the correctness of claims through probabilistic inference schemes.
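To make the contrast between majority voting and joint inference concrete, the following is a minimal sketch in Python. It is not one of the specific methods covered in the seminar. It assumes a deliberately simplified model: each flight has exactly one true arrival time, each source has a single accuracy value, and a wrong source picks uniformly among the other reported values. The site names, flight names, times, and function names are all hypothetical and were chosen only for illustration.

```python
from collections import Counter, defaultdict
import math

# Hypothetical toy data: claims[source][flight] = reported arrival time.
consistent = {"flight_1": "10:00", "flight_2": "12:30",
              "flight_3": "15:45", "flight_4": "18:20"}
claims = {
    "site_a": dict(consistent),
    "site_b": dict(consistent),
    "site_c": {"flight_1": "10:00", "flight_2": "12:50",
               "flight_3": "15:10", "flight_4": "18:50"},
    "site_d": {"flight_1": "10:25", "flight_2": "12:30",
               "flight_3": "15:20", "flight_4": "18:50"},
    "site_e": {"flight_1": "10:40", "flight_2": "12:10",
               "flight_3": "15:45", "flight_4": "18:50"},
}


def majority_vote(claims):
    """Baseline: pick the most frequently reported value per object."""
    votes = defaultdict(Counter)
    for reported in claims.values():
        for obj, value in reported.items():
            votes[obj][value] += 1
    return {obj: c.most_common(1)[0][0] for obj, c in votes.items()}


def discover_truth(claims, iterations=20, initial_accuracy=0.8):
    """EM-style sketch: alternate between value beliefs and source accuracies.

    Assumption (not from the seminar text): a source with accuracy a reports
    the true value with probability a, and otherwise picks uniformly among
    the other candidate values seen for that object.
    """
    accuracy = {source: initial_accuracy for source in claims}
    candidates = defaultdict(set)
    for reported in claims.values():
        for obj, value in reported.items():
            candidates[obj].add(value)

    belief = {}
    for _ in range(iterations):
        # Step 1: posterior belief over candidate values, given accuracies.
        for obj, values in candidates.items():
            n_wrong = max(len(values) - 1, 1)
            log_score = {}
            for v in values:
                total = 0.0
                for source, reported in claims.items():
                    if obj not in reported:
                        continue
                    a = accuracy[source]
                    p = a if reported[obj] == v else (1.0 - a) / n_wrong
                    total += math.log(p)
                log_score[v] = total
            top = max(log_score.values())  # subtract max for stability
            weights = {v: math.exp(s - top) for v, s in log_score.items()}
            norm = sum(weights.values())
            belief[obj] = {v: w / norm for v, w in weights.items()}

        # Step 2: accuracy = average belief in the values a source reported.
        for source, reported in claims.items():
            mean = sum(belief[o][v] for o, v in reported.items()) / len(reported)
            accuracy[source] = min(max(mean, 0.01), 0.99)  # keep logs finite

    truth = {obj: max(dist, key=dist.get) for obj, dist in belief.items()}
    return truth, accuracy


voted = majority_vote(claims)
truth, accuracy = discover_truth(claims)
print("majority vote:", voted["flight_4"])
print("joint estimate:", truth["flight_4"])
print({source: round(a, 2) for source, a in accuracy.items()})
```

In this toy data, three sites agree on 18:50 for flight_4, so majority voting selects that value. Those same three sites disagree with everyone else on most other flights, so the joint estimate lowers their accuracy and prefers 18:20, the value reported by the two sites that are consistent elsewhere. Whether that value is actually correct cannot be known without ground truth; the sketch only shows how estimated reliability can override a raw vote count.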