NCU Institutional Repository (中大機構典藏), offering downloads of theses and dissertations, past exam papers, journal articles, research projects, and more: Item 987654321/84087


    Please use this permanent URL to cite or link to this item: http://ir.lib.ncu.edu.tw/handle/987654321/84087


    Title: 新聞導言之智能生成;Intelligent generation of news lead
    Author: 鍾文翔;Zhong, Wen-Xiang
    Contributor: Department of Information Management
    Keywords: News lead (News introduction); Event extraction; 5w1h; TextRank; Word2Vec
    Date: 2020-08-24
    Upload time: 2020-09-02 18:02:49 (UTC+8)
    Publisher: National Central University
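    Abstract: The news lead is a very important part of news content. Placed at the start of a news story, the lead states the key points of the article in the most concise wording and draws readers into finishing the whole report. Leads fall into two broad categories: hard news leads and soft news leads. A hard lead usually contains the when, where, what, who, why, and how of the news, abbreviated 5w1h, and is expected to describe the main subject of the news as fully as possible in a short space; a soft lead instead tends to use novelty and suspense to attract readers' interest. In current natural language processing work, however, there is a great deal of research on generating news headlines and news summaries but comparatively little on automatically generating leads.
    This research mainly establishes a framework for automatically generating leads. Starting from an analysis of the writing techniques and elements of leads themselves, it uses TextRank combined with Word2Vec, sentence position, sentence length, and title overlap rate to identify the key events of a news story and obtain a set of topic sentences. Part-of-speech tagging, named entity recognition, semantic role labeling, and similar methods are then applied to that sentence set to extract the 5w1h elements of the news, after which hard news leads and soft news leads are generated separately.
    For hard news leads, features are extracted for seven common hard news lead types, namely narrative, descriptive, quotation, question, commentary, conclusion, and contrast; examples of such features include research findings, location descriptions, questions, and quoted sentences. The 5w1h elements and the lead features are then combined to produce a hard news lead. For soft news leads, sentence patterns that hide 5w1h elements are used to create suspense, and soft news leads were generated successfully.
    Using these methods, this research generated both hard news leads and soft news leads, ensuring that each generated lead contains enough of the key news information and that different types of lead can be produced according to user needs.
    Besides helping users reduce the manpower and time needed to write leads, this research gives the generated leads a variety of writing styles that can be changed to suit the user's needs, and the generated leads let readers grasp the news information quickly.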
    ;The news lead is a very important part of news content. Placed at the beginning of a news story, it presents the key content of the article in the most concise text to attract readers to read the entire report. Leads are written in many ways, but a lead usually contains the 5w1h information of the news: when, where, who, what, why, and how. In natural language processing, there is a great deal of research on generating headlines and summaries, but little research on automatically generating leads.
    This research mainly aims to establish a framework for automatically generating leads. The writing techniques and elements of the lead are analyzed, and TextRank combined with Word2Vec, sentence position, sentence length, and title overlap rate is used to identify key events in the news and obtain a set of topic sentences. Part-of-speech tagging, named entity recognition, semantic role labeling, and other methods are then applied to the sentence set to extract the 5w1h elements of the news, and hard news leads and soft news leads are generated respectively.
    Features are extracted from seven common types of hard news lead, and the 5w1h elements and the lead features are then combined to produce a hard news lead. The soft news lead is generated with syntax that hides 5w1h elements.
    With these methods, this research produced hard news leads and soft news leads, ensuring that each generated lead contains enough key news information and that different types of lead can be generated according to the needs of users.
    This research not only helps users reduce the manpower and time required to write leads, but also gives the generated leads a variety of writing styles, which can be changed according to the needs of users. The generated leads also allow readers to understand the news information quickly.
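    The key-sentence step described in the abstract (a TextRank score combined with sentence position, sentence length, and title overlap) can be illustrated with a minimal sketch. This is not the thesis's implementation. It makes several assumptions that are not in the record: sentence similarity is a plain bag-of-words cosine rather than Word2Vec embeddings, tokenization is a simple regular expression (Chinese text would need a real word segmenter), and the function names and the feature weights in `rank_sentences` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (a stand-in for real segmentation)."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def textrank(vectors, damping=0.85, iterations=50):
    """Power-iteration PageRank over a weighted sentence-similarity graph."""
    n = len(vectors)
    if n == 0:
        return []
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(sim[j][i] / out_weight[j] * scores[j]
                            for j in range(n) if out_weight[j] > 0)
            for i in range(n)
        ]
    return scores


def rank_sentences(title, sentences, weights=(0.5, 0.2, 0.1, 0.2)):
    """Return sentence indices ordered from highest to lowest combined score.

    weights = (textrank, position, length, title overlap); the values are
    hypothetical and chosen only for illustration.
    """
    if not sentences:
        return []
    vectors = [Counter(tokenize(s)) for s in sentences]
    title_tokens = set(tokenize(title))
    tr = textrank(vectors)
    max_tr = max(tr) or 1.0
    max_len = max(sum(v.values()) for v in vectors) or 1
    n = len(sentences)
    w_tr, w_pos, w_len, w_ov = weights
    combined = []
    for i, vec in enumerate(vectors):
        position = 1.0 - i / n                      # earlier sentences score higher
        length = sum(vec.values()) / max_len        # longer sentences score higher
        overlap = (len(title_tokens & set(vec)) / len(title_tokens)
                   if title_tokens else 0.0)        # share of title words present
        combined.append(w_tr * tr[i] / max_tr + w_pos * position
                        + w_len * length + w_ov * overlap)
    return sorted(range(n), key=lambda i: combined[i], reverse=True)
```

    Calling `rank_sentences(title, sentences)` returns sentence indices with the strongest topic-sentence candidates first; the top few would form the sentence set that later steps tag for 5w1h elements. Swapping `cosine` for a similarity over averaged word embeddings would bring the sketch closer to the TextRank plus Word2Vec combination the abstract names.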
    Appears in collections: [Graduate Institute of Information Management] Theses and Dissertations

    Files in this item:

    File         Description   Size   Format   Views
    index.html                 0Kb    HTML     323


    All items in NCUIR are protected by their original copyright.


    ::: Copyright National Central University. | All rights reserved by National Central University Library | Site established: 8-24-2009 :::
    DSpace Software Copyright © 2002-2004 MIT & Hewlett-Packard / Enhanced by NTU Library IR team