Monitor the difference between two files
difflib compares two files (or any two sequences of lines) and reports the differences line by line.
The module is particularly useful for troubleshooting, for example when comparing two versions of a configuration file.
import difflib

# First string
text1 = """text1:
This module provides classes and functions for comparing sequences.
including HTML and context and unified diffs.
difflib document v7.4
add string
"""
# Split into lines so the two texts can be compared line by line
text1_lines = text1.splitlines()

# Second string
text2 = """text2:
This module provides classes and functions for Comparing sequences.
including HTML and context and unified diffs.
difflib document v7.5"""
text2_lines = text2.splitlines()

d = difflib.Differ()  # create a Differ object
diff = d.compare(text1_lines, text2_lines)  # compare the two lists of lines
print('\n'.join(diff))
Output:
- text1:
?     ^
+ text2:
?     ^
- This module provides classes and functions for comparing sequences.
?                                                ^
+ This module provides classes and functions for Comparing sequences.
?                                                ^
  including HTML and context and unified diffs.
- difflib document v7.4
?                     ^
+ difflib document v7.5
?                     ^
- add string
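Meaning of the symbols:
- "- " : the line appears only in the first sequence.
- "+ " : the line appears only in the second sequence.
- "  " (two spaces) : the line is identical in both sequences.
- "? " : a hint line that is not part of either input; it marks where the two lines around it differ.
- "^" : inside a "? " line, points at a character that was changed.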
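To create a more readable HTML report:
Replace the last three lines of the script above with:
d = difflib.HtmlDiff()  # create an HtmlDiff object
difference_html = d.make_file(text1_lines, text2_lines)  # build a complete HTML page as a string
# Write the page as UTF-8 (Python 3), then open difference.html in a browser
with open('difference.html', 'w', encoding='utf-8') as fout:
    fout.write(difference_html)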
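The examples above compare strings held in memory. For the configuration-file case mentioned at the start, a minimal sketch along the following lines may be closer to day-to-day use. It assumes both files are UTF-8 text; the helper name `diff_files` is made up for this illustration and is not part of difflib.

```python
import difflib


def diff_files(path_a, path_b):
    """Return the unified diff between two text files as a list of lines."""
    with open(path_a, encoding="utf-8") as fa:
        lines_a = fa.readlines()
    with open(path_b, encoding="utf-8") as fb:
        lines_b = fb.readlines()
    # fromfile/tofile only label the diff header; the files are not reopened.
    return list(difflib.unified_diff(lines_a, lines_b,
                                     fromfile=path_a, tofile=path_b))
```

Calling `sys.stdout.writelines(diff_files("old.conf", "new.conf"))` would print the familiar `---` / `+++` / `@@` format; an empty list means the files have identical lines.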
filecmp
filecmp can compare files, directories, and subdirectories, which is useful for auditing and for verifying backups.
filecmp ships with the standard library in Python 2.3 and later, so nothing needs to be installed.
Methods:
1. cmp: compare a single pair of files
Syntax: filecmp.cmp(f1, f2[, shallow])
Returns True if files f1 and f2 appear to be the same, and False otherwise.
shallow: defaults to True. In that case the os.stat() signatures of the two files are compared first, and if they match the files are taken to be equal without reading their contents. If the signatures differ, or if shallow is False, the contents are compared.
The signature consists of the file type, the size, and the modification time.
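The effect of shallow is easiest to see with two files that have the same size and modification time but different contents. The sketch below builds such a pair in a temporary directory; the helper `make_file` and the fixed timestamp are assumptions made for the demonstration.

```python
import filecmp
import os
import tempfile


def make_file(directory, name, content):
    """Write content to directory/name and pin its modification time."""
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write(content)
    # Give every file the same timestamp so the os.stat() signatures match.
    os.utime(path, (1000000000, 1000000000))
    return path


with tempfile.TemporaryDirectory() as tmp:
    a = make_file(tmp, "a.txt", "version=1\n")
    b = make_file(tmp, "b.txt", "version=2\n")  # same size, different content
    shallow_result = filecmp.cmp(a, b)                # compares signatures only
    deep_result = filecmp.cmp(a, b, shallow=False)    # compares the contents

print("shallow:", shallow_result)
print("deep:", deep_result)
```

Here the shallow comparison reports the files as equal while the content comparison does not, which is why shallow=False is the safer choice when verifying backups.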
2. cmpfiles: compare multiple files
Syntax: filecmp.cmpfiles(dir1, dir2, common[, shallow])
Compares the files named in the list common in dir1 and dir2, and returns three lists of file names: match, mismatch, and errors. The errors list holds files that could not be compared, for example because a file does not exist in one of the directories or the user lacks permission to read it.
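As an illustration of the three returned lists, the following sketch creates two temporary directories with one identical file, one changed file, and one file that exists on only one side. The file names and the `write` helper are invented for the example.

```python
import filecmp
import os
import tempfile


def write(directory, name, content):
    """Create directory/name with the given text content."""
    with open(os.path.join(directory, name), "w") as f:
        f.write(content)


with tempfile.TemporaryDirectory() as dir1, tempfile.TemporaryDirectory() as dir2:
    write(dir1, "same.txt", "hello\n")
    write(dir2, "same.txt", "hello\n")
    write(dir1, "changed.txt", "old\n")
    write(dir2, "changed.txt", "newer\n")
    write(dir1, "missing.txt", "only in dir1\n")  # absent from dir2

    names = ["same.txt", "changed.txt", "missing.txt"]
    match, mismatch, errors = filecmp.cmpfiles(dir1, dir2, names, shallow=False)

print("match:", match)
print("mismatch:", mismatch)
print("errors:", errors)
```

The file that is missing from dir2 lands in errors, not in mismatch, so a backup check should inspect both lists.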
3. dircmp: compare directories
Syntax: filecmp.dircmp(a, b[, ignore[, hide]])
Three report methods:
- .report(): print a comparison between the two target directories.
The most commonly used report is object.report():
import filecmp
result_dir = filecmp.dircmp("dir a", "dir b")
result_dir.report()  # report() prints to stdout and returns None
- .report_partial_closure(): compare the target directories and their common immediate subdirectories.
- .report_full_closure(): compare the target directories and all common subdirectories, recursively.
Other attributes:
left
- The directory a.
right
- The directory b.
left_list
- Files and subdirectories in a, filtered by hide and ignore.
right_list
- Files and subdirectories in b, filtered by hide and ignore.
common
- Files and subdirectories in both a and b.
left_only
- Files and subdirectories only in a.
right_only
- Files and subdirectories only in b.
common_dirs
- Subdirectories in both a and b.
common_files
- Files in both a and b.
common_funny
- Names in both a and b, such that the type differs between the directories, or names for which os.stat() reports an error.
same_files
- Files which are identical in both a and b, using the class’s file comparison operator.
diff_files
- Files which are in both a and b, whose contents differ according to the class’s file comparison operator.
funny_files
- Files which are in both a and b, but could not be compared.
subdirs
- A dictionary mapping names in common_dirs to dircmp objects
Example:
import filecmp

result_dir = filecmp.dircmp("dir1", "dir2")
result_dir.report()
result_dir.report_partial_closure()
result_dir.report_full_closure()
print("left_list: " + str(result_dir.left_list))
print("right_list: " + str(result_dir.right_list))
print("common: " + str(result_dir.common))
print("left_only: " + str(result_dir.left_only))
print("right_only: " + str(result_dir.right_only))
print("common_dirs: " + str(result_dir.common_dirs))
print("common_files: " + str(result_dir.common_files))
print("common_funny: " + str(result_dir.common_funny))
print("same_file: " + str(result_dir.same_files))
print("diff_files: " + str(result_dir.diff_files))
print("funny_files: " + str(result_dir.funny_files))
Output:
diff dir1 dir2
Only in dir2 : ['.DS_Store']
Identical files : ['file2.pdf']
Differing files : ['file1.doc']
diff dir1 dir2
Only in dir2 : ['.DS_Store']
Identical files : ['file2.pdf']
Differing files : ['file1.doc']
diff dir1 dir2
Only in dir2 : ['.DS_Store']
Identical files : ['file2.pdf']
Differing files : ['file1.doc']
left_list: ['file1.doc', 'file2.pdf']
right_list: ['.DS_Store', 'file1.doc', 'file2.pdf']
common: ['file1.doc', 'file2.pdf']
left_only: []
right_only: ['.DS_Store']
common_dirs: []
common_files: ['file1.doc', 'file2.pdf']
common_funny: []
same_file: ['file2.pdf']
diff_files: ['file1.doc']
funny_files: []
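The report methods only print. When the result is needed in a script, for instance to list every changed file in a backup, the subdirs attribute can be walked recursively. The function below is a sketch of that idea; the name `differing_files` is made up here, and it relies on dircmp's default (shallow) file comparison.

```python
import filecmp
import os


def differing_files(dcmp, prefix=""):
    """Recursively collect relative paths of files that differ."""
    found = [os.path.join(prefix, name) for name in dcmp.diff_files]
    # subdirs maps each common subdirectory name to its own dircmp object.
    for name, sub in dcmp.subdirs.items():
        found.extend(differing_files(sub, os.path.join(prefix, name)))
    return found
```

For example, `differing_files(filecmp.dircmp("dir1", "dir2"))` returns a list of paths relative to the two directories being compared.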
smtplib
# coding: utf-8
# Python 3 version
from io import StringIO
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.header import Header
from email import charset
from email.generator import Generator
from email.utils import formataddr
import smtplib

# Example address data (the addresses are redacted in the source; substitute real ones)
from_address = ['Frank', '[email protected]']
recipient = ['Frank too', '[email protected]']
subject = 'Unicode test'

# Example body
html = 'Unicode?\nTest?'
text = 'Unicode?\nTest?\nTest from Frank, just ignore it'

# Default encoding mode set to quoted-printable. Acts globally!
charset.add_charset('utf-8', charset.QP, charset.QP, 'utf-8')

# 'alternative' MIME type: HTML and plain text bundled in one e-mail message
msg = MIMEMultipart('alternative')
msg['Subject'] = Header(subject, 'utf-8')
# Only the display name of sender and recipient is encoded, not the e-mail address
msg['From'] = formataddr((from_address[0], from_address[1]))
msg['To'] = formataddr((recipient[0], recipient[1]))

# Attach both parts; mail clients prefer the last part, so the HTML part goes last
textpart = MIMEText(text, 'plain', 'UTF-8')
htmlpart = MIMEText(html, 'html', 'UTF-8')
msg.attach(textpart)
msg.attach(htmlpart)

# Create a generator and flatten the message object into a file-like object
str_io = StringIO()
g = Generator(str_io, False)
g.flatten(msg)
# str_io.getvalue() now contains the message, ready to send

# Optionally send it using Python's smtplib (or just use Django's)
s = smtplib.SMTP('smtp.gmail.com', 587)
# If you don't want to encrypt the connection, comment out the following 3 lines
s.ehlo()
s.starttls()
s.ehlo()
s.login("<user_name>", "<password>")
s.sendmail(from_address[1], recipient[1], str_io.getvalue())
s.quit()
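Before pointing a script like this at a real SMTP server, it can help to build the message and inspect it locally. The following is a minimal dry-run sketch in Python 3, not the script above: `build_message` is a hypothetical helper, the example.com addresses are placeholders, and nothing is sent.

```python
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formataddr


def build_message(sender, recipient, subject, text, html):
    """Build a multipart/alternative message; sender and recipient are (name, address) pairs."""
    msg = MIMEMultipart("alternative")
    msg["Subject"] = subject
    msg["From"] = formataddr(sender)
    msg["To"] = formataddr(recipient)
    msg.attach(MIMEText(text, "plain", "utf-8"))
    msg.attach(MIMEText(html, "html", "utf-8"))
    return msg


# Placeholder addresses, used only for this dry run.
raw = build_message(("Frank", "frank@example.com"),
                    ("Frank too", "frank.too@example.com"),
                    "Unicode test",
                    "Unicode?\nTest?",
                    "<p>Unicode?<br>Test?</p>").as_string()

# Parse the flattened text back and list the MIME structure.
parsed = message_from_string(raw)
for part in parsed.walk():
    print(part.get_content_type())
```

If the listed structure and headers look right, the same string can be handed to `smtplib.SMTP.sendmail()`.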
Pycurl
Before installing pycurl, install libcurl4-openssl-dev:
sudo apt-get install libcurl4-openssl-dev
Then install pycurl:
sudo pip install pycurl
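A quick way to confirm that the installation worked is to import the module and print its version string. This small check is only a sketch; the function name `pycurl_version` is made up for the example.

```python
def pycurl_version():
    """Return pycurl's version string, or None if pycurl cannot be imported."""
    try:
        import pycurl
    except ImportError:
        return None
    return pycurl.version


print(pycurl_version() or "pycurl is not installed")
```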
# Python 3 version
import os
import sys
import pycurl

URL = input("Type a website address first: ")  # read the website address from the user
c = pycurl.Curl()  # create a Curl object
c.setopt(pycurl.URL, URL)  # set the URL
c.setopt(pycurl.CONNECTTIMEOUT, 5)  # timeout for establishing the connection, in seconds
c.setopt(pycurl.TIMEOUT, 5)  # timeout for the whole request, in seconds
c.setopt(pycurl.NOPROGRESS, 1)  # 0 shows the progress bar, any other value hides it
c.setopt(pycurl.FORBID_REUSE, 1)  # close the connection after the transfer instead of reusing it
c.setopt(pycurl.MAXREDIRS, 1)  # maximum number of HTTP redirects
c.setopt(pycurl.DNS_CACHE_TIMEOUT, 30)  # how long to keep DNS cache entries, in seconds
indexfile = open(os.path.join(os.path.dirname(os.path.realpath(__file__)), "content.txt"), "wb")
c.setopt(pycurl.WRITEHEADER, indexfile)  # write the HTTP headers to indexfile
c.setopt(pycurl.WRITEDATA, indexfile)  # write the HTTP body to indexfile
try:
    c.perform()  # send the request
except Exception as e:  # handle connection errors
    print("connection error: " + str(e))
    indexfile.close()
    c.close()
    sys.exit()

NAMELOOKUP_TIME = c.getinfo(c.NAMELOOKUP_TIME)  # DNS resolution time
CONNECT_TIME = c.getinfo(c.CONNECT_TIME)  # time to establish the connection
PRETRANSFER_TIME = c.getinfo(c.PRETRANSFER_TIME)  # time until the transfer is about to begin
STARTTRANSFER_TIME = c.getinfo(c.STARTTRANSFER_TIME)  # time until the first byte is received
TOTAL_TIME = c.getinfo(c.TOTAL_TIME)  # total time of the request
HTTP_CODE = c.getinfo(c.HTTP_CODE)  # HTTP status code
SIZE_DOWNLOAD = c.getinfo(c.SIZE_DOWNLOAD)  # size of the downloaded data
HEADER_SIZE = c.getinfo(c.HEADER_SIZE)  # size of the HTTP headers
SPEED_DOWNLOAD = c.getinfo(c.SPEED_DOWNLOAD)  # average download speed

print("HTTP status code: %s" % HTTP_CODE)
print("DNS resolving time: %.2f ms" % (NAMELOOKUP_TIME * 1000))
print("Time to establish connection: %.2f ms" % (CONNECT_TIME * 1000))
print("Time to prepare for the transfer: %.2f ms" % (PRETRANSFER_TIME * 1000))
print("Time to start the transfer: %.2f ms" % (STARTTRANSFER_TIME * 1000))
print("Total transfer time: %.2f ms" % (TOTAL_TIME * 1000))
print("Size downloaded: %d bytes" % SIZE_DOWNLOAD)
print("HTTP header size: %d bytes" % HEADER_SIZE)
print("Average download speed: %d bytes/s" % SPEED_DOWNLOAD)
indexfile.close()
c.close()
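One detail worth keeping in mind when reading these numbers: libcurl's timers are cumulative, each measured from the start of the request, so the connect time already includes the DNS lookup. The sketch below turns such cumulative values into per-phase durations. It needs no network access; the function name, the phase labels, and the sample values are all invented for illustration.

```python
def phase_durations(timers):
    """Convert cumulative libcurl timers (seconds) into per-phase durations (ms)."""
    # (timer key, label for the phase that ends at that timer)
    phases = [
        ("namelookup", "dns"),
        ("connect", "connect"),
        ("pretransfer", "pretransfer"),
        ("starttransfer", "first_byte"),
        ("total", "download"),
    ]
    durations = {}
    previous = 0.0
    for key, label in phases:
        durations[label] = round((timers[key] - previous) * 1000, 2)
        previous = timers[key]
    return durations


# Made-up sample values, in seconds, in the shape c.getinfo() would return them.
sample = {
    "namelookup": 0.010,
    "connect": 0.030,
    "pretransfer": 0.030,
    "starttransfer": 0.120,
    "total": 0.150,
}
print(phase_durations(sample))
```

With the script above, the dictionary could be filled from NAMELOOKUP_TIME, CONNECT_TIME, PRETRANSFER_TIME, STARTTRANSFER_TIME, and TOTAL_TIME.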