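This comes up often. One example is filtering out keywords that have already been generated, where the match is by containment; for exact matches, just deduplicate in an editor (Notepad++ and Sublime are recommended). Another is extracting the keywords that contain a given root, and so on. Matching is case-insensitive.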
ok.txt, output file; stores the roots that are contained in a keyword
key.txt, the roots, one per line
keyword.txt, the keywords, one per line
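# Open the output and keyword files together so both are closed on exit.
with open('ok.txt', 'w', encoding='utf8') as ok, \
        open('keyword.txt', encoding='utf8') as wordlist:
    for keyword in wordlist:
        keyword = keyword.strip('\n')
        # kw, kurl = keyword.split('\t')
        # key.txt is re-read from the start for every keyword.
        with open('key.txt', encoding='utf8') as keylist:
            for k in keylist:
                k = k.strip('\n')
                # Case-insensitive containment test.
                if k.upper() in keyword.upper():
                    # Record the first matching root, then move to the next keyword.
                    ok.write(k + '\n')
                    break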
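The script above reopens key.txt for every keyword and records the matching root. For the two uses mentioned at the top (dropping keywords that contain an already-used root, or keeping only those that do), a variant that loads the roots once and splits the keywords into two lists may be handier. The following is only a sketch under assumptions: the function name `split_by_roots` is hypothetical, it uses `str.casefold()` instead of `upper()` for the case-insensitive comparison, it strips all surrounding whitespace rather than just `'\n'`, and it skips blank lines, since an empty root would match every keyword.

```python
def split_by_roots(keywords, roots):
    """Split keywords into (matched, unmatched) lists.

    A keyword is "matched" if it contains at least one root,
    compared case-insensitively. Hypothetical helper, not part
    of the original script.
    """
    # Normalise the roots once; drop blank entries, which would match everything.
    folded_roots = [r.strip().casefold() for r in roots if r.strip()]
    matched, unmatched = [], []
    for kw in keywords:
        kw = kw.strip()
        if not kw:
            continue  # ignore blank keyword lines
        folded = kw.casefold()
        if any(root in folded for root in folded_roots):
            matched.append(kw)
        else:
            unmatched.append(kw)
    return matched, unmatched


if __name__ == "__main__":
    # In-memory sample data; with real files, pass the open file objects instead.
    sample_keywords = ["Python tutorial", "java basics", "learn PYTHON fast"]
    sample_roots = ["python"]
    hit, miss = split_by_roots(sample_keywords, sample_roots)
    print(hit)
    print(miss)
```

Because file objects iterate line by line, the same function should work with `open('keyword.txt', encoding='utf8')` and `open('key.txt', encoding='utf8')` passed in directly; write `matched` or `unmatched` to ok.txt depending on whether the goal is extraction or filtering.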