Bookmarks Database and Internet Robot

A set of classes, libraries, programs and plugins I use to manipulate my
bookmarks.html: check for updates, find expired URLs and so on.

These programs are intended to run as follows.

1. bkmk2db converts bookmarks.html to bookmarks.db.
2. check_urls (Internet robot) runs against bookmarks.db, checks every URL
   and saves results in check.db.
3. db2bkmk converts bookmarks.db back to bookmarks.html.

Then I use this bookmarks file and...

4. bkmk2db converts bookmarks.html to bookmarks.db.
5. check_urls (Internet robot) runs against bookmarks.db, checks every URL
   and saves results in check.db (old file copied to check.old).
6. (A yet-unnamed program) will compare check.old with check.db and
   generate a detailed report. For example:
      this URL is unchanged
      this URL is changed
      this URL is unavailable due to: host not found...

AUTHOR
   Oleg Broytman

COPYRIGHT and LEGAL ISSUES
   Copyright (C) 1997-2014 PhiloSoft Design
   All sources protected by GNU GPL. Programs are provided "as-is", without
   any kind of warranty. All usual blah-blah-blah.

#include LICENSE GPL

------------------------------ environ ------------------------------

These programs use the following environment variables:
   BKMK_STORAGE - use this storage plugin; default is pickle storage.
   BKMK_WRITER - use this writer plugin; default is HTML writer.
   BKMK_ROBOT - use this robot plugin; default is forking robot.

------------------------------ bkmk2db ------------------------------

NAME
   bkmk2db.py - script to convert bookmarks.html to a database.

SYNOPSIS
   bkmk2db.py [-is] [/path/to/bookmarks.html]

DESCRIPTION
   bkmk2db.py splits the given file (or ./bookmarks.html) into a database
   (using the storage plugin).

   Options:
   -i
      Inhibit progress bar. Default is to display the progress bar if
      stderr.isatty()

   -s
      Suppress output of statistics at the end of the program. Default is
      to write how many lines the program read and how many URLs it parsed.
      Also suppress some messages during the run.
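
The default described for -i can be illustrated with a minimal Python
sketch. The function name want_progress_bar and its parameters are
hypothetical and are not taken from bkmk2db.py; only the stderr.isatty()
test comes from the option description.

```python
import sys


def want_progress_bar(inhibit=False, stream=None):
    """Return True when a progress bar should be drawn.

    Hypothetical helper: it mirrors the documented default (show the bar
    only when stderr is a terminal) and lets a flag such as -i turn it off.
    """
    if stream is None:
        stream = sys.stderr
    if inhibit:
        # An explicit request to inhibit the bar always wins.
        return False
    return stream.isatty()


if __name__ == "__main__":
    # When stderr is redirected to a file or pipe this prints False.
    print(want_progress_bar())
```

Passing the stream as a parameter is only a convenience for trying the
function out with something other than the real stderr.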

BUGS
   Aliases are not supported (yet).

------------------------------ db2bkmk ------------------------------

NAME
   db2bkmk.py - script to reconstruct bookmarks.html back from a database.

SYNOPSIS
   db2bkmk.py [-s] [-p prune] [-o output_file] [-t dict.db [-r]]

DESCRIPTION
   db2bkmk.py reads bookmarks.db and creates two HTML files -

   Options:
   -s
      Suppress output of statistics at the end of the program. Default is
      to write how many records the program processed and how many URLs it
      created. Also suppress some messages during the run.

   -p prune
      Prune the bookmarks tree when a folder with this name is encountered.

   -o output_file
      Put output into a different file.

   -t dict.db
      For most tasks, if someone needs to process bookmarks.db in a regular
      way (for example, replace all "gopher://gopher." with "http://www."),
      it is easy to write a special program that processes every DB record.
      But there are cases when someone needs to process bookmarks.db in a
      non-regular way: one URL must be changed in one way, another URL in
      a second way, etc. The -t option allows an external dictionary to be
      used for such translation. The dictionary itself is a FLAD database,
      where every record has two keys - URL1 and URL2. With the -t option
      in effect, db2bkmk generates a translated version of bookmarks.html,
      where every URL1 is replaced with the corresponding URL2 from the
      translation dictionary. (See koi2win.db for an example of a
      translation dictionary.)

   -r
      Reverse the effect of the -t option: translate from URL2 to URL1.

------------------------------ check_urls -----------------------------

NAME
   check_urls.py - Internet robot

SYNOPSIS
   check_urls.py [-ise]

DESCRIPTION
   check_urls.py runs a robot plugin against every URL. An additional
   field, Error, appears in records that could not be checked for some
   reason; the reason is the content of the Error field.

   Options:
   -i
      Inhibit progress bar. Default is to display the progress bar if
      stderr.isatty()

   -s
      Suppress output of statistics at the end of the program.
      Default is to write how many records the program processed and how
      many URLs it checked. Also suppress some messages during the run.

   -e
      Check only those URLs that have an "error" mark in the DB.

BUGS
   Ugly mechanism to catch the welcome message from an FTP server (from
   urllib).

------------------------------ convert_st -----------------------------

NAME
   convert_st.py - convert between storages.

SYNOPSIS
   convert_st.py [-s] new_format

DESCRIPTION
   convert_st.py converts the database from one format to another.

   Options:
   -s
      Suppress output of statistics at the end of the program. Default is
      to write how many records the program processed and how many URLs it
      checked. Also suppress some messages during the run.

------------------------------ sort_db -----------------------------

NAME
   sort_db.py - sort DB.

SYNOPSIS
   sort_db.py [-savmr]

DESCRIPTION
   sort_db.py sorts the database according to one of the time fields and
   dumps the sorted list of bookmarks.

   Options:
   -s
      Suppress output of statistics at the end of the program. Default is
      to write how many records the program processed and how many URLs it
      checked. Also suppress some messages during the run.

   -a
      Sort by add_date.

   -v
      Sort by last_visit.

   -m
      Sort by last_modified.

   -r
      Reverse sort.

------------------------------ check_dups -----------------------------

NAME
   check_dups.py - check duplicated URLs in the DB.

SYNOPSIS
   check_dups.py [-s] [-l logfile]

DESCRIPTION
   check_dups.py prints out a list of duplicated URLs (if any).

   Options:
   -s
      Suppress output of statistics at the end of the program. Default is
      to write how many records the program processed and how many URLs it
      checked. Also suppress some messages during the run.

   -l logfile
      Save the list of dups in the logfile.

------------------------------ bkmk-add -----------------------------

NAME
   bkmk-add - add a bookmark to the DB.

SYNOPSIS
   bkmk-add [-s] [-t title] url

DESCRIPTION
   bkmk-add adds a bookmark to the DB.

   Options:
   -s
      Suppress output of statistics at the end of the program.
      Default is to write how many records the program processed and how
      many URLs it checked. Also suppress some messages during the run.

   -t title
      Force the title of the bookmark.
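
The URL translation described for the -t and -r options of db2bkmk can be
sketched in Python as follows. This is only an illustration under stated
assumptions: the dictionary is modelled as a list of plain dicts with URL1
and URL2 keys rather than read from a real FLAD file, the URLs are made up,
URLs that are not listed are assumed to stay unchanged, and the function
names build_translation and translate_url are hypothetical and not part of
db2bkmk.py.

```python
def build_translation(records, reverse=False):
    """Build a lookup table from records that each hold URL1 and URL2.

    With reverse=True the mapping goes from URL2 to URL1, which is how
    the -r option is described.
    """
    table = {}
    for record in records:
        source, target = record["URL1"], record["URL2"]
        if reverse:
            source, target = target, source
        table[source] = target
    return table


def translate_url(url, table):
    """Return the translated URL, or the URL itself if it is not listed.

    Leaving unlisted URLs untouched is an assumption of this sketch.
    """
    return table.get(url, url)


# Made-up dictionary records, for illustration only.
records = [
    {"URL1": "http://example.org/koi/", "URL2": "http://example.org/win/"},
]

forward = build_translation(records)                 # like -t dict.db
backward = build_translation(records, reverse=True)  # like -t dict.db -r

print(translate_url("http://example.org/koi/", forward))
print(translate_url("http://example.org/win/", backward))
print(translate_url("http://example.org/other/", forward))
```

Building the table once and looking each URL up in it keeps the per-record
work small, which suits the "one URL in one way, another URL in a second
way" case the -t option is meant for.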