Sunday, September 27, 2015

Your own Broken Link Checker

Python script: crawl a website, take screenshots, and save them in a Word document.

Steps to execute this script
  1. Create a folder 'link Checker' and place this script in it.
  2. Create a folder 'shots' inside the 'link Checker' folder.
  3. Change the variables below:
    • url = "website url"
    • SaveDirectory = r"screenshot directory"
  4. Execute the program.
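
Steps 1 and 2 can also be done from Python. This is only a minimal sketch, assuming Python 3 and the folder names used above; `make_shots_folder` and its `base` argument are hypothetical names for illustration, not part of the script.

```python
import os

def make_shots_folder(base="."):
    """Create 'link Checker/shots' under base and return the path to 'shots'."""
    shots = os.path.join(base, "link Checker", "shots")
    # exist_ok=True avoids an error when the folders are already there
    os.makedirs(shots, exist_ok=True)
    return shots
```

The returned path is what you would assign to SaveDirectory in step 3.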

 import glob
 import os
 import time
 import webbrowser
 import requests
 import win32com.client as win32  # pywin32; needs Windows with Word installed
 from bs4 import BeautifulSoup  # BeautifulSoup 4 (package 'beautifulsoup4')
 from PIL import ImageGrab  # Pillow
 url = "http://www.dhatricworkspace.com"
 response = requests.get(url)
 # parse the html, then turn it back into a string for the simple link search below
 page = str(BeautifulSoup(response.content, "html.parser"))
 alllist = []
 httplist = []
 otherlist = []
 SaveDirectory = r'C:\Users\Giridhar\Desktop\link Checker\shots'
 ImageEditorPath = r'C:\WINDOWS\system32\mspaint.exe'
 def getURL(page):  
   """  
   :param page: html of the web page, as a string
   :return: the next url found in the page and the index where it ends,
        or (None, 0) when no link is left
   """  
   start_link = page.find("a href")  
   if start_link == -1:  
     return None, 0  
   start_quote = page.find('"', start_link)  
   end_quote = page.find('"', start_quote + 1)  
   url = page[start_quote + 1: end_quote]  
   return url, end_quote  
 while True:  
   url, n = getURL(page)  
   page = page[n:]  
   if url:  
     #print url  
     alllist.append(url)  
   else:  
     break  
 for httpurl in alllist:
   if httpurl.startswith("http"):
     # absolute link
     httplist.append(httpurl)
   elif httpurl.startswith("/"):
     # site-relative link (collected, but not visited below)
     otherlist.append(httpurl)
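 new = 0  # 0 = reuse the same browser window if possible
 i = 1
 for browsing in httplist:
   webbrowser.get('windows-default').open(browsing, new=new)
   time.sleep(10)  # give the page time to load before grabbing the screen
   img = ImageGrab.grab()
   print(browsing)
   saveas = os.path.join(SaveDirectory, 'ScreenShot_' + str(i) + '.jpg')
   img.save(saveas)
   i += 1
   if i == 10:  # stop after 9 screenshots
     break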
 #---------------------------------code to move images to word-------
 # read the screenshots back from the directory they were saved to
 allpics = sorted(glob.glob(os.path.join(SaveDirectory, '*.jpg')))
 wordApp = win32.gencache.EnsureDispatch('Word.Application') # create a Word application object
 wordApp.Visible = False # hide the Word application
 doc = wordApp.Documents.Add() # create a new document
 for pic in allpics:
   current_pic = doc.InlineShapes.AddPicture(pic)
   current_pic.Height = 400
   current_pic.Width = 400
 doc.SaveAs(os.path.join(os.getcwd(), 'dhatricworkspace.docx'))
 wordApp.Application.Quit(-1) # -1 = save changes on exit


When the script runs, it opens the links found on the start page in the default browser, one at a time, and takes a screenshot of each (up to nine, as the loop is written).
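
The script finds links by searching the page text for `a href`, so it misses anchors where another attribute comes before `href`, and it never visits the relative links it collects. The sketch below shows a more tolerant alternative using Python 3's standard `html.parser` and `urljoin`. `LinkCollector` and `extract_links` are hypothetical names, and this is not how the script above does it.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin turns "/about" into a full url and leaves
                # absolute urls unchanged
                self.links.append(urljoin(self.base_url, value))

def extract_links(html, base_url):
    """Return all link targets in html as absolute urls, in page order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

With this approach, `extract_links(response.text, url)` could stand in for the `getURL` loop, and relative links would be ready to open as well.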

Once the screenshots are done, it creates a Word document containing all of them.
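
The script records each page visually but does not test whether a link actually responds. Here is a rough sketch of such a check, assuming `fetch_status` is any callable that returns an HTTP status code for a url. Both `find_broken_links` and `fetch_status` are hypothetical names introduced here.

```python
def find_broken_links(urls, fetch_status):
    """Return (url, reason) pairs for links that fail or answer with 4xx/5xx."""
    broken = []
    for url in urls:
        try:
            status = fetch_status(url)
        except Exception as exc:
            # connection errors, timeouts, bad urls and so on
            broken.append((url, str(exc)))
            continue
        if status >= 400:
            broken.append((url, status))
    return broken
```

One way to call it with the script's own list would be `find_broken_links(httplist, lambda u: requests.head(u, allow_redirects=True, timeout=10).status_code)`. Some servers reject HEAD requests, so a GET may be needed for those.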


