Course Syllabus, Semester 1, Academic Year 108 (2019-20)

Please respect intellectual property rights and do not photocopy textbooks without authorization.

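I. Basic Course Information
Course serial number: 0540
Program level:
Course code: ISM0800
Course title: 資訊檢索原理與應用
English title: Principles and Applications of Information Retrieval
Full/half year:
Required/elective: Elective
Credits: 3.0
Weekly class hours: Lecture hours: 3
Offering department and level: Graduate Institute of Library and Information Studies (Master's)
Prerequisites: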
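Course description: Information retrieval (IR) covers a very broad range of meanings and topics. From the perspective of library and information science, IR is a body of domain knowledge that links library and information science with modern information technology. From the perspective of information science, IR has expanded into knowledge management, data science, natural language processing, artificial intelligence, and related fields, and has become essential knowledge and skill. IR research continually applies computer and network technology to improve how information is filtered, extracted, stored, classified, indexed, summarized, retrieved, disseminated, and shared. The principles and applications of IR are not only foundational cross-disciplinary knowledge but also an important direction for industrial applications and academic research.
This course centers on search engine systems and introduces their related technologies and applications, so that students build a foundation in IR concepts. The course also discusses important recent trends in IR, particularly research topics covered at major international academic conferences, such as deep learning and automatic text generation (e.g., dialogue systems). In addition, the course includes hands-on practice with web search engines, document classification, topic clustering, automatic summarization, opinion analysis, related-term extraction, and text generation, so that students become familiar with IR theory and also learn to apply it in practice.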
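Course objectives and corresponding departmental core competencies
Course objectives:
 1. Understand the major theories of information retrieval
 2. Understand the retrieval behavior patterns of information users
 3. Acquire skills in implementing and evaluating information retrieval systems
 4. Keep abreast of research trends in information retrieval
Corresponding departmental core competencies (Master's):
 1-3 Explore theories and methods concerning information users and information use
 2-1 Ability to analyze and solve problems
 2-2 Ability to integrate and apply digital and network technologies
 2-3 Ability to plan and evaluate information systems
 4-1 Be people-centered, respect knowledge, and integrate information technology with innovative services to promote the free and effective use of knowledge.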

II. Syllabus
Instructor: Yuen-Hsien Tseng (曾元顯)
Schedule and topics

Week Course content Notes
1 Course Overview
2 Search Engines and Information Retrieval Chap. 1
3 Architecture of a Search Engine Chap. 2
4 System Implementation Tutorial I
5 Crawls and Feeds; Crawler Implementation Chap. 3
6 Processing Text, Term Extraction, Word Embedding Chap. 4
7 System Implementation Tutorial II
8 Ranking with Indexes, Page Ranking Chap. 5
9 Queries and Interfaces, Chatbot Q&A Systems Chap. 6
10 Retrieval Models, Knowledge Graph Reasoning Chap. 7
11 System Implementation Tutorial III
12 Evaluating Search Engines, Evaluation Metrics Chap. 8
13 Classification and Clustering, Machine Learning Chap. 9
14 Social Search and Human Factors Chap. 10
15 Beyond Bag of Words, Deep Learning Chap. 11
16 Talk: User Information Retrieval Behavior
17 Talk: Mobile User Interface
18 Final system demonstrations

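Teaching methods
Method Description
Lecture Course content is presented with slides
Discussion An instant response system (IRS: http://pro.ccr.tw/) is used to poll students and stimulate discussion
Lab/hands-on practice Tools for search engines, document classification, topic clustering, etc. are provided for students to install and use
Other Paper reading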
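Assessment methods
Method Percentage Description
Assignments 20% Several assignments, such as installing the relevant systems and presenting the results, online in-class quizzes, and online remote quizzes
Class discussion participation 10% Asking questions, responding, and taking part in discussion in class
Attendance 10% Class attendance
Presentations 30% Oral presentation and discussion of the reading chapters assigned for each class
Project 30% The course teaches relevant open-source software. Each student plans a topic and uses the tools learned to build a small system for web search, document classification, topic clustering, automatic summarization, or text generation, and demonstrates the results at the end of the semester.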
References

Required reading

Croft, B., Metzler, D., & Strohman, T. (2015). Search Engines: Information Retrieval in Practice. Addison-Wesley. https://ciir.cs.umass.edu/irbook/.

 

BOOKS

General IR (CS)

  1. Büttcher, S., Clarke, C.L.A., & Cormack, G.V. (2016). Information Retrieval: Implementing and Evaluating Search Engines. MIT Press.
  2. Baeza-Yates, R., & Ribeiro-Neto, B. (2011). Modern Information Retrieval: The Concepts and Technology behind Search, 2nd ed. Addison-Wesley.
  3. Chowdhury, G.G. (2010). Introduction to Modern Information Retrieval, 3rd ed. Neal-Schuman.
  4. Grossman, D.A., & Frieder, O. (2004). Information Retrieval: Algorithms and Heuristics, 2nd ed. Springer.
  5. Hearst, M.A. (2009). Search User Interfaces. Cambridge University Press.
  6. Manning, C.D., Raghavan, P., & Schütze, H. (2008). Introduction to Information Retrieval. Cambridge University Press. (http://nlp.stanford.edu/IR-book/)
  7. Salton, G. (1983). Introduction to Modern Information Retrieval. McGraw-Hill.
  8. Sparck Jones, K., & Willett, P. (1997). Readings in Information Retrieval. Morgan Kaufmann.
  9. van Rijsbergen, C.J. (1979). Information Retrieval. Butterworths.

 

General (LIS)

  1. Göker, A., & Davies, J. (eds.) (2009). Information Retrieval: Searching in the 21st Century. Wiley.
  2. Harter, S.P. (1986). Online Information Retrieval: Concepts, Principles and Techniques. Academic Press.
  3. Hunter, E.J. (2009). Classification Made Simple: An Introduction to Knowledge Organisation and Information Retrieval, 3rd ed. Ashgate.
  4. Lancaster, F.W., & Warner, A.J. (1993). Information Retrieval Today. Info Resources Press.
  5. Saracevic, T., & Marchionini, G. (eds.) (2012). Relevance in Information Retrieval. Morgan & Claypool.

 

Search Engines

  1. Langville, A.N., & Meyer, C.D. (2012). Google's PageRank and Beyond: The Science of Search Engine Rankings. Princeton University Press.
  2. Battelle, J. (2005). The Search: How Google and its Rivals Rewrote the Rules of Business and Transformed Our Culture. Nicholas Brealey.

 

Search Engine Optimization (SEO)

  1. Kennedy, A.F., & Hauksson, K.M. (2012). Global Search Engine Marketing: Getting Better International Search Engine Results. Que.
  2. Enge, E., et al. (2009). The Art of SEO: Mastering Search Engine Optimization. O'Reilly.
  3. Lieb, R. (2009). The Truth about Search Engine Optimization. FT Press.

 

Cross-Language IR

  1. Peters, C., Braschler, M., & Clough, P. (2012). Multilingual Information Retrieval: from Research to Practice. Springer.

 

Multimedia IR

  1. Müller, M. (2007). Information Retrieval for Music and Motion. Springer.
  2. Ras, Z.W., & Wieczorkowska, A. (eds.) (2010). Advances in Music Information Retrieval. Springer.

 

Web Mining

  1. Kaushik, A. (2009). Web Analytics 2.0: The Art of Online Accountability and Science of Customer Centricity. Sybex.
  2. Chakrabarti, S. (2002). Mining the Web: Analysis of Hypertext and Semi Structured Data. Morgan Kaufmann.
  3. Liu, B. (2011). Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data, 2nd ed. Springer.
  4. Web NER ToolKit (2019, DS4NER), https://sites.google.com/site/nculab/projects/web-ner-tool

 

Search as Learning

  1. Vakkari, P. (2016). Searching as Learning: A Systematization Based on Literature. Journal of Information Science, 42(1), 7-18.
  2. Eickhoff, C., Gwizdka, J., Hauff, C., & He, J. (2017). Introduction to the special issue on search as learning. https://link.springer.com/article/10.1007/s10791-017-9315-9
  3. Search as Learning at CIKM 2018 - Claudia Hauff, https://chauff.github.io/2018-08-07-sal-at-cikm/

 

Semantic Web

  1. Davies, J., Studer, R., & Warren, P. (eds.) (2006). Semantic Web Technologies: Trends and Research in Ontology-based Systems. Wiley.
  2. Allemang, D., & Hendler, J. (2011). Semantic Web for the Working Ontologist, 2nd ed. Morgan Kaufmann.

 

User Behavior

  1. Ingwersen, P., & Järvelin, K. (2005). The Turn: Integration of Information Seeking and Retrieval in Context. Springer.
  2. Ruthven, I., & Kelly, D. (eds.) (2011). Interactive Information Seeking, Behaviour and Retrieval. Facet.
  3. Spink, A., & Jansen, B.J. (2004). Web Search: Public Searching of the Web. Springer.
  4. Warner, J. (2009). Human Information Retrieval. The MIT Press.

 

Socio-Technical Aspects

  1. Brin, D. (1998). The Transparent Society: Will Technology Force Us to Choose Between Privacy and Freedom? Basic Books.
  2. Huberman, B.A. (2001). The Laws of the Web: Patterns in the Ecology of Information. MIT Press.
  3. Lesser, E.L., Fontaine, M.A., & Slusher, J.A., eds. (2000). Knowledge and Communities. Butterworth-Heinemann.
  4. Lessig, L. (1999). Code and Other Laws of Cyberspace. Basic Books.
  5. 吳世弘 (2017). "The CYUT System on Social Book Search Track since INEX 2013 to CLEF 2016." 圖書館學與資訊科學 (Journal of Library and Information Science), 43(2), 6-19. http://140.122.104.2/ojs../index.php/jlis/article/view/733

 

Neural Approaches to Information Retrieval

  1. (Word Embedding) Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed representations of words and phrases and their compositionality. In Advances in Neural Information Processing Systems.
  2. Word embeddings in 2017: Trends and future directions: http://ruder.io/word-embeddings-2017/ (OOV handling, Subword-level embeddings, Multi-sense embeddings, Phrases and multi-word expressions).
  3. (fastText from Facebook) Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T. (2016). Bag of Tricks for Efficient Text Classification. CoRR, abs/1607.01759.
  4. (Deep Learning Model from Google) Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv. https://arxiv.org/pdf/1810.04805.pdf
  5. (GPT-2: Deep Learning Model from OpenAI) Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language Models are Unsupervised Multitask Learners. Retrieved from https://d4mucfpksywv.cloudfront.net/better-language-models/language_models_are_unsupervised_multitask_learners.pdf
  6. (XLNet) Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R., & Le, Q.V. (2019). XLNet: Generalized Autoregressive Pretraining for Language Understanding. https://arxiv.org/abs/1906.08237

 

CONFERENCES

  1. ACM SIGIR Annual Conference. http://www.acm.org/sigir/
  2. ASIS&T Annual Conference. http://www.asis.org/
  3. JCDL (Joint Conference on Digital Libraries). http://www.jcdl.org/
  4. TREC (Text REtrieval Conference). http://trec.nist.gov/
  5. NTCIR, http://research.nii.ac.jp/ntcir/index-en.html
  6. WWW Annual Conference. http://www.iw3c2.org/

 

JOURNALS

D-Lib Magazine

Information Processing and Management (IP&M)

Information Research

Journal of the Association for Information Science and Technology (JASIST)

Journal of Documentation (JDoc)

 

WEB RESOURCES

  1. ACM SIGIR Information Retrieval Resources http://www.sigir.org/resources.html
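  2. 鄭卜壬 & 李家豪 (2007). Introduction to Digital Archive Technologies (數位典藏技術導論), Chapter 4: Information Retrieval Techniques. http://ebook.iis.sinica.edu.tw/pdf/ch4_InformationRetrieval.pdf
  3. Academia Sinica (2007). Introduction to Digital Archive Technologies (數位典藏技術導論). http://ebook.iis.sinica.edu.tw/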
  4. Glasgow IR resources (http://ir.dcs.gla.ac.uk/resources.html)
  5. UCLA Graduate School of Education & Information School (http://polaris.gseis.ucla.edu/jfurner/00-01/273/273res.html)
  6. Information Research Weblog (http://www.free-conversant.com/irweblog/)
  7. Search Engine Meeting Conference (http://www.infonortics.com/searchengines/)
  8. Web IR and IE (http://www.webir.org/)

 

Copyright © 2024 National Taiwan Normal University