Engineering collaborations for accessing hidden web resources

Keywords: Collaborative approach; Hidden Web; Deep Web; Information retrieval.

Abstract

The World Wide Web has long been a research platform for information and data scientists. The value of information is generally seen as proportional to the depth of the World Wide Web. While search engines cover the surface web extensively, it is the deep web that poses challenges to information and data scientists. Considerable research has gone into extracting information from deep web resources. Many researchers have studied the problem from their own perspectives and devised solutions to the problem statements they formulated. Individually, these efforts have shown promising results. However, no joint effort has yet brought these individual efforts together to reap their collective benefits and address a much larger and more sensible problem statement. This paper highlights the different problem statements of different researchers and suggests a collaborative approach that consolidates them into one common problem statement and its probable solution.

Published
2021-12-15
How to Cite
Sehgal, M. S. (2021). Engineering collaborations for accessing hidden web resources. PREPARE@u® | IEI Conferences. https://doi.org/10.36375/prepare_u.iei.a203
Section
- 36.IEC | Computer Engineering