Bookmarks Database and Internet Robot
A set of classes, libraries, programs and plugins I use to manipulate my
bookmarks.html - check for updates, find expired URLs and so on.
These programs are intended to run as follows.
1. bkmk2db converts bookmarks.html to bookmarks.db.
2. check_urls (Internet robot) runs against bookmarks.db, checks every URL and
saves the results in check.db.
3. db2bkmk converts bookmarks.db back to bookmarks.html.
Then I use this bookmarks file and...
4. bkmk2db converts bookmarks.html to bookmarks.db.
5. check_urls (Internet robot) runs against bookmarks.db, checks every URL and
saves the results in check.db (the old file is copied to check.old).
6. (A yet unnamed program) will compare check.old with check.db and generate
a detailed report. For example:
this URL is unavailable due to: host not found...
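
The report program in step 6 does not exist yet, so the following is only a minimal sketch of the comparison it describes. It assumes the results of each run have already been loaded into a plain dict mapping a URL to an error string (or None when the URL was reachable). The function name compare_checks and that dict layout are hypothetical; they are not the actual format of check.db.

```python
def compare_checks(old, new):
    """Return report lines for URLs whose status changed between two runs.

    Both arguments are hypothetical {url: error_or_None} mappings.
    """
    report = []
    for url in sorted(new):
        old_error = old.get(url)
        new_error = new[url]
        if new_error == old_error:
            continue  # nothing changed since the previous run
        if new_error is not None:
            report.append(f"{url}: this URL is unavailable due to: {new_error}")
        else:
            report.append(f"{url}: this URL is available again")
    return report


# Made-up results standing in for check.old and check.db.
old = {"http://a.example/": None, "http://b.example/": "timeout"}
new = {"http://a.example/": "host not found", "http://b.example/": None}
for line in compare_checks(old, new):
    print(line)
```

Reporting only the URLs whose status changed keeps the output short when most bookmarks are stable between runs.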
Oleg Broytman <phd@phdru.name>
COPYRIGHT and LEGAL ISSUES
Copyright (C) 1997-2015 PhiloSoft Design
All sources are protected by the GNU GPL. Programs are provided "as-is", without
any kind of warranty. All usual blah-blah-blah.
------------------------------ environ ------------------------------
These programs use the following environment variables:
BKMK_STORAGE - use this storage plugin; default is pickle storage.
BKMK_WRITER - use this writer plugin; default is HTML writer.
BKMK_ROBOT - use this robot plugin; default is forking robot.
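
As an illustration of how such variables can be resolved, here is a small sketch that reads the three variables and falls back to the defaults named above. The function plugin_settings and the literal default strings ("pickle", "html", "forking") are assumptions made for this example; the real programs may spell the plugin names differently.

```python
import os

# Defaults as described above; the exact plugin name strings are assumptions.
DEFAULTS = {
    "BKMK_STORAGE": "pickle",
    "BKMK_WRITER": "html",
    "BKMK_ROBOT": "forking",
}


def plugin_settings(environ=None):
    """Return the plugin chosen for each variable, falling back to defaults."""
    if environ is None:
        environ = os.environ
    # An unset or empty variable selects the default plugin.
    return {name: environ.get(name) or default
            for name, default in DEFAULTS.items()}


# "json" is a hypothetical storage plugin name used only for this demo.
print(plugin_settings({"BKMK_STORAGE": "json"}))
```

Passing the environment as an argument makes the lookup easy to try without changing the real process environment.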
------------------------------ bkmk2db ------------------------------
bkmk2db.py - script to convert bookmarks.html to a database.
bkmk2db.py [-is] [/path/to/bookmarks.html]
bkmk2db.py splits the given file (or ./bookmarks.html) into a database
(using the storage plugin).
Inhibit progress bar. Default is to display progress bar if
Suppress output of statistics at the end of the program. Default
is to write how many lines the program read and how many URLs
it parsed. Also suppresses some messages during the run.
Aliases are not supported (yet).
------------------------------ db2bkmk ------------------------------
db2bkmk.py - script to reconstruct bookmarks.html back from a
db2bkmk.py [-s] [-p prune] [-o output_file] [-t dict.db [-r]]
db2bkmk.py reads bookmarks.db and creates two HTML files -
Suppress output of statistics at the end of the program. Default is
to write how many records the program processed and how many URLs
it created. Also suppresses some messages during the run.
Prune the bookmarks tree when a folder with this name is encountered.
Write the output to a different file.
For most tasks, if someone needs to process bookmarks.db in a
regular way (for example, replace all "gopher://gopher." with
"http://www."), it is easy to write a special program that processes
every DB record. But there are cases when someone needs to process
bookmarks.db in a non-regular way: one URL must be changed
in one way, another URL in a second way, etc. The -t option allows
using an external dictionary for such translation. The dictionary itself
is a FLAD database, where every record has two keys - URL1 and
URL2. With the -t option in effect, db2bkmk generates a translated
version of bookmarks.html, where every URL1 is replaced with the
corresponding URL2 from the translation dictionary. (See koi2win.db
for an example of a translation dictionary.)
Reverse the effect of the -t option - translate from URL2 to URL1.
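
The translation described for -t and -r can be sketched as a simple lookup. This sketch assumes the dictionary records have already been read into a list of (URL1, URL2) pairs; it does not parse the FLAD file itself, and the function translate_url and the sample URLs are hypothetical, not taken from db2bkmk or koi2win.db.

```python
def translate_url(url, pairs, reverse=False):
    """Translate url using a list of (URL1, URL2) pairs.

    With reverse=True the lookup goes from URL2 back to URL1,
    mirroring what the -r option is described as doing.
    """
    if reverse:
        mapping = {url2: url1 for url1, url2 in pairs}
    else:
        mapping = dict(pairs)
    return mapping.get(url, url)  # URLs missing from the dictionary stay as-is


# Hypothetical dictionary records, not taken from koi2win.db.
pairs = [("http://old.example/koi/", "http://old.example/win/")]
print(translate_url("http://old.example/koi/", pairs))
print(translate_url("http://old.example/win/", pairs, reverse=True))
```

Leaving unlisted URLs untouched matches the idea that only the "non-regular" cases need an entry in the dictionary.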
------------------------------ check_urls -----------------------------
check_urls.py - Internet robot
check_urls.py runs a robot plugin against every URL. An additional Error
field appears in records that could not be checked for some reason;
the reason is the content of the Error field.
Inhibit progress bar. Default is to display progress bar if
Suppress output of statistics at the end of the program. Default is
to write how many records the program processed and how many URLs
it checked. Also suppresses some messages during the run.
Check only those URLs that have an "error" mark in the DB.
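
The Error field and the "check only errors" mode can be illustrated with a short sketch. It is not the robot plugin itself: records are assumed to be plain dicts with a "url" key, the lowercase "error" key stands in for the Error field, and fetch is a stand-in for whatever actually retrieves a URL (for instance urllib.request.urlopen). The names check_records and fake_fetch are hypothetical.

```python
def check_records(records, fetch, only_errors=False):
    """Run fetch(url) for each record and store failures in an "error" field.

    records is a list of dicts with a "url" key; fetch is any callable that
    raises an exception when the URL cannot be retrieved.
    Returns the number of records that were checked.
    """
    checked = 0
    for record in records:
        if only_errors and "error" not in record:
            continue  # skip records that passed the previous run
        record.pop("error", None)  # clear the old mark before re-checking
        try:
            fetch(record["url"])
        except Exception as exc:
            record["error"] = str(exc)  # the reason becomes the field content
        checked += 1
    return checked


def fake_fetch(url):
    """Stand-in for a real network fetch so the demo needs no connection."""
    if "bad" in url:
        raise OSError("host not found")


records = [{"url": "http://good.example/"}, {"url": "http://bad.example/"}]
print(check_records(records, fake_fetch))                    # checks both records
print(check_records(records, fake_fetch, only_errors=True))  # re-checks the failed one
```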
An ugly mechanism is used to catch the welcome message from an FTP server (from urllib).
------------------------------ convert_st -----------------------------
convert_st.py - convert between storages.
convert_st.py [-s] new_format
convert_st.py converts the database from one format to another.
Suppress output of statistics at the end of the program. Default is
to write how many records the program processed and how many URLs
it checked. Also suppresses some messages during the run.
------------------------------ sort_db -----------------------------
sort_db.py - sort DB.
sort_db.py sorts the database according to one of the time
fields and dumps the sorted list of bookmarks.
Suppress output of statistics at the end of the program. Default is
to write how many records the program processed and how many URLs
it checked. Also suppresses some messages during the run.
Sort by last_modified.
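
Sorting by a time field can be sketched as follows. The sketch assumes bookmarks are plain dicts and that the time field holds a comparable value such as a Unix timestamp; the function sort_bookmarks, the record layout and the choice to place records without the field at the end are assumptions, not sort_db.py's actual behaviour.

```python
def sort_bookmarks(bookmarks, field="last_modified", reverse=False):
    """Return bookmarks sorted by a time field; records lacking it go last."""
    with_time = [b for b in bookmarks if b.get(field) is not None]
    without_time = [b for b in bookmarks if b.get(field) is None]
    ordered = sorted(with_time, key=lambda b: b[field], reverse=reverse)
    return ordered + without_time


# Made-up records with Unix timestamps.
bookmarks = [
    {"url": "http://b.example/", "last_modified": 1400000000},
    {"url": "http://c.example/"},
    {"url": "http://a.example/", "last_modified": 1300000000},
]
for bookmark in sort_bookmarks(bookmarks):
    print(bookmark["url"])
```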
------------------------------ check_dups -----------------------------
check_dups.py - check for duplicate URLs in the DB.
check_dups.py [-s] [-l logfile]
check_dups.py prints out a list of duplicate URLs (if any).
Suppress output of statistics at the end of the program. Default is
to write how many records the program processed and how many URLs
it checked. Also suppresses some messages during the run.
Save the list of duplicates in the logfile.
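
Duplicate detection of this kind can be sketched by grouping records by URL. The sketch assumes bookmarks are plain dicts with "url" and "title" keys; the function find_duplicates and that layout are hypothetical, not the code check_dups.py uses.

```python
from collections import defaultdict


def find_duplicates(bookmarks):
    """Return {url: [titles]} for every URL that occurs more than once."""
    seen = defaultdict(list)
    for bookmark in bookmarks:
        seen[bookmark["url"]].append(bookmark["title"])
    return {url: titles for url, titles in seen.items() if len(titles) > 1}


# Made-up records for the demo.
bookmarks = [
    {"url": "http://a.example/", "title": "A"},
    {"url": "http://b.example/", "title": "B"},
    {"url": "http://a.example/", "title": "A again"},
]
for url, titles in find_duplicates(bookmarks).items():
    print(url, titles)
```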
------------------------------ bkmk-add -----------------------------
bkmk-add - add a bookmark to the DB.
bkmk-add [-s] [-t title] url
bkmk-add adds a bookmark to the DB.
Suppress output of statistics at the end of the program. Default is
to write how many records the program processed and how many URLs
it checked. Also suppresses some messages during the run.
Force the title of the bookmark.