Inferring User Image-Search Goals under the Implicit Guidance of Users

IEEE based on DATA MINING DOTNET  2014

Abstract:

Search engine companies collect the “database of intentions,” the histories of their users’ search queries. These search logs are a gold mine for researchers. Search engine companies, however, are wary of publishing search logs because doing so could disclose sensitive information. In this paper, we analyze algorithms for publishing frequent keywords, queries, and clicks of a search log. We first show that methods achieving variants of k-anonymity are vulnerable to active attacks. We then demonstrate that the stronger guarantee ensured by differential privacy unfortunately does not provide any utility for this problem. We then propose an algorithm, ZEALOUS, and show how to set its parameters to achieve probabilistic privacy. We also contrast our analysis of ZEALOUS with an analysis that achieves indistinguishability. Our paper concludes with a large experimental study using real applications, in which we compare ZEALOUS with previous work that achieves k-anonymity in search log publishing. Our results show that ZEALOUS yields comparable utility to k-anonymity while at the same time achieving much stronger privacy guarantees.

Search engines play a crucial role in navigating the vastness of the web. Today’s search engines do not just collect and index web pages; they also collect and mine information about their users. They store the queries, clicks, IP addresses, and other information about their interactions with users in what is called a search log.

 EXISTING SYSTEM:

  •  Existing work on privacy-preserving publishing considers only frequent items/item sets.
  •  Query substitutions rephrase a user query to match documents or advertisements that do not contain the actual keywords of the query.
  •  Existing work on publishing frequent item sets often only tries to achieve anonymity or makes strong assumptions about the background knowledge of an attacker.
  •  The existing system does not generate sub-reports from a main report by any criterion, and real-time report generation is not supported.
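
The weakness of anonymity-style publishing against active attacks can be illustrated with a small sketch. It assumes a simplified publishing rule in which a keyword is released, with its exact user count, once at least k distinct users have issued it; the function name publish_k_frequent, the (user_id, keyword) log format, and the sample data are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def publish_k_frequent(search_log, k):
    """Return keywords issued by at least k distinct users, with exact user counts.

    search_log is an iterable of (user_id, keyword) pairs (an assumed format).
    """
    users_per_keyword = defaultdict(set)
    for user_id, keyword in search_log:
        users_per_keyword[keyword].add(user_id)
    return {kw: len(users) for kw, users in users_per_keyword.items() if len(users) >= k}


# Honest log: only one real user searched for the sensitive keyword.
log = [
    ("alice", "rare disease"),
    ("alice", "weather"),
    ("bob", "weather"),
    ("carol", "weather"),
]
k = 3
before = publish_k_frequent(log, k)  # the sensitive keyword stays hidden

# Active attack: the attacker registers k - 1 fake users who all issue the
# target keyword. It is published only if some real user also issued it.
fake = [(f"fake{i}", "rare disease") for i in range(k - 1)]
after = publish_k_frequent(log + fake, k)

print(before)
print(after)
```

Because the attacker controls exactly k - 1 of the contributing users, the appearance of the keyword in the output reveals that at least one genuine user issued it, even though the threshold was nominally respected.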

  PROPOSED SYSTEM:

  •  This paper presents a comparative study of publishing frequent keywords, queries, and clicks in search logs.
  •  We compare the disclosure limitation guarantees and the theoretical and practical utility of various approaches.
  •  Our comparison includes earlier work on anonymity and indistinguishability.
  •  Our proposed solution, ZEALOUS, achieves probabilistic differential privacy in search logs.
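
This synopsis does not spell out the steps of the proposed algorithm, so the following is only a minimal sketch of a two-threshold, noisy-histogram scheme in the spirit of ZEALOUS as it is commonly described: bound each user's contribution, drop rare items, add Laplace noise, and drop items whose noisy count is still low. The parameter names m, tau, lam, and tau_prime, the input format, and the choice to keep each user's first m distinct items are assumptions made for illustration. How the parameters must be set to obtain a given privacy guarantee is not covered here.

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def zealous_sketch(user_histories, m, tau, lam, tau_prime, seed=None):
    """Publish noisy counts of frequent items (illustrative sketch only).

    user_histories: dict mapping user_id -> list of items (keywords, queries, or clicks).
    m:         maximum number of distinct items each user may contribute.
    tau:       first threshold, applied to the exact user counts.
    lam:       scale of the Laplace noise added to the surviving counts.
    tau_prime: second threshold, applied to the noisy counts.
    """
    rng = random.Random(seed)

    # Step 1: each user contributes at most m distinct items (first m seen here).
    counts = Counter()
    for items in user_histories.values():
        distinct = list(dict.fromkeys(items))[:m]
        counts.update(distinct)

    published = {}
    for item, count in counts.items():
        # Step 2: discard items whose exact user count is below tau.
        if count < tau:
            continue
        # Step 3: perturb the count with Laplace noise.
        noisy = count + laplace_noise(lam, rng)
        # Step 4: publish only items whose noisy count reaches tau_prime.
        if noisy >= tau_prime:
            published[item] = noisy
    return published


if __name__ == "__main__":
    histories = {
        "u1": ["weather", "news", "weather"],
        "u2": ["weather"],
        "u3": ["weather", "rare disease"],
    }
    print(zealous_sketch(histories, m=2, tau=2, lam=1.0, tau_prime=2.5, seed=42))
```

Limiting each user to m items bounds how much any single user can change the histogram, and the two thresholds keep rarely issued items out of the output both before and after noise is added.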

 HARDWARE SPECIFICATION:

  •  System     : Pentium IV 2.4 GHz
  •  Hard Disk  : 40 GB
  •  RAM        : 512 MB

 SOFTWARE SPECIFICATION:

  •  Application Type  : Web Application
  •  Framework         : Microsoft VS.NET 2005
  •  Front End         : ASP.NET 4.0
  •  Code Behind       : C#.NET
  •  Back End          : SQL Server 2005