
Looking for an approach to counting the words in an English text file with Python

Thanks to @刘鑫-MarsLiu on Weibo for the "one small program a day" tag. How would you implement the task above?

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

"""
For any English plain-text file, count its words, lines, and characters.
"""

file_name = "movie.txt"

line_counts = 0
word_counts = 0
character_counts = 0

with open(file_name, 'r') as f:
    for line in f:
        # Split on whitespace to get the words on this line
        words = line.split()

        line_counts += 1
        word_counts += len(words)
        # len(line) includes the trailing newline
        character_counts += len(line)

print("line_counts", line_counts)
print("word_counts", word_counts)
print("character_counts", character_counts)

What could be improved in the code above? How can it be made more Pythonic?

Python's standard library has a collections module that can solve this problem.
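The answer above does not say which part of collections it has in mind; one likely candidate is collections.Counter. The following is only a minimal sketch under that assumption. The helper name count_text, the WORD_RE pattern, and the decision to lowercase words before counting are all illustrative choices, not something from the original thread.

```python
import re
from collections import Counter

# Assumption: a "word" is a run of ASCII letters or digits.
WORD_RE = re.compile(r"[A-Za-z0-9]+")


def count_text(lines):
    """Hypothetical helper: return (line_count, char_count, word_counter)."""
    line_count = 0
    char_count = 0
    words = Counter()
    for line in lines:
        line_count += 1
        char_count += len(line)  # includes the trailing newline, if present
        # Assumption: "The" and "the" count as the same word.
        words.update(WORD_RE.findall(line.lower()))
    return line_count, char_count, words


# In-memory sample so the sketch runs without any file on disk.
sample = ["The cat sat.\n", "The dog sat too.\n"]
line_count, char_count, words = count_text(sample)
print("lines:", line_count)
print("chars:", char_count)
print("words:", sum(words.values()))
print("unique words:", len(words))
```

Because a file object is itself an iterable of lines, the same helper can be fed a real file, for example by passing f inside a with open(file_name) as f: block. Counter also offers most_common(n) if only the most frequent words are of interest.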

#!/usr/bin/python3

# How about this? See if it works for you.

import re

file_name = 'test.txt'

lines_count = 0
words_count = 0
chars_count = 0
words_dict = {}

with open(file_name, 'r') as f:
    for line in f:
        lines_count += 1
        chars_count += len(line)
        # Keep only runs of ASCII letters and digits; anything else is a separator
        for word in re.findall(r'[a-zA-Z0-9]+', line):
            words_count += 1
            words_dict[word] = words_dict.get(word, 0) + 1

# words_count is the total number of words; len(words_dict) is the number of distinct words
print('words_count is', words_count)
print('unique_words_count is', len(words_dict))
print('lines_count is', lines_count)
print('chars_count is', chars_count)

# Print each distinct word with its frequency
for k, v in words_dict.items():
    print(k, v)

Article from 玩蛇网. If you repost it, please credit the source and link to the article: https://www.iplaypy.com/wenda/wd20162.html

Revised: May 22, 2017, 11:49:52. Published on 玩蛇网.
