Bookmarks Database and Internet Robot

Here is a set of classes, libraries, programs and plugins I use to
manipulate my bookmarks.html. I like Netscape Navigator, but I need more
features, so I write and maintain these programs for my needs. I need to
extend Navigator's "What's new" feature (Navigator 4 named it "Update
Bookmarks").
These programs are intended to run as follows.
1. bkmk2db converts bookmarks.html to bookmarks.db.
2. check_urls (Internet robot) runs against bookmarks.db, checks every URL and
   saves results in check.db.
3. db2bkmk converts bookmarks.db back to bookmarks.html.
Then I use this bookmarks file and...
4. bkmk2db converts bookmarks.html to bookmarks.db.
5. check_urls (Internet robot) runs against bookmarks.db, checks every URL and
   saves results in check.db (old file copied to check.old).
6. (A yet unnamed program) will compare check.old with check.db and generate
   a detailed report. For example:
   this URL is unavailable due to: host not found...
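
The comparison in step 6 could look roughly like the sketch below. It is only an illustration: it assumes both check results have already been loaded into plain dicts mapping a URL to an error string (or None when the URL was reachable). The real check.db format is not described here, and the name compare_checks and every report wording except "unavailable due to" are hypothetical.

```python
def compare_checks(old, new):
    """Compare two check results.

    Assumed (hypothetical) format: dict of URL -> error string, or None
    when the URL was reachable.
    """
    report = []
    for url in sorted(new):
        error = new[url]
        if error is not None:
            report.append("%s is unavailable due to: %s" % (url, error))
        elif url not in old:
            report.append("%s is new" % url)
        elif old[url] is not None:
            report.append("%s is available again" % url)
        else:
            report.append("%s is unchanged" % url)
    return report


old = {"http://a.example/": None, "http://b.example/": None}
new = {"http://a.example/": None, "http://b.example/": "host not found"}
for line in compare_checks(old, new):
    print(line)
```

Keeping the comparison separate from the robot means check_urls only has to write check.db, and the report can be regenerated at any time from the two files.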
Oleg Broytmann <phd@phd.pp.ru>

COPYRIGHT and LEGAL ISSUES
Copyright (C) 1997-2002 PhiloSoft Design
All sources are protected by the GNU GPL. Programs are provided "as-is",
without any kind of warranty. All usual blah-blah-blah.
------------------------------ environ ------------------------------

These programs use the following environment variables:

BKMK_STORAGE - use this storage plugin; default is pickle storage.
BKMK_WRITER - use this writer plugin; default is HTML writer.
BKMK_ROBOT - use this robot plugin; default is forking robot.
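
As an illustration of how such variables can select plugins, here is a minimal sketch. The fallback strings "pickle", "html" and "forking" are placeholders standing for the defaults named above; the real plugin identifiers are not given in this document, and plugin_names is a hypothetical helper.

```python
import os

# Placeholder fallback names (assumptions); they stand for the pickle
# storage, HTML writer and forking robot defaults described above.
DEFAULTS = {
    "BKMK_STORAGE": "pickle",
    "BKMK_WRITER": "html",
    "BKMK_ROBOT": "forking",
}


def plugin_names(environ=os.environ):
    """Return the plugin name for each variable, using the default
    when the variable is unset or empty."""
    return {var: environ.get(var) or default
            for var, default in DEFAULTS.items()}


# Override only the robot; storage and writer keep their defaults.
print(plugin_names({"BKMK_ROBOT": "simple"}))
```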
------------------------------ bkmk2db ------------------------------

bkmk2db.py - script to convert bookmarks.html to a database.

bkmk2db.py [-is] [/path/to/bookmarks.html]

bkmk2db.py splits the given file (or ./bookmarks.html) into a database
(using the storage plugin).

-i
   Inhibit progress bar. Default is to display progress bar if

-s
   Suppress output of statistics at the end of the program. Default
   is to write how many lines the program read and how many URLs
   it parsed. Also suppress some messages during the run.

Aliases are not supported (yet).
------------------------------ db2bkmk ------------------------------

db2bkmk.py - script to reconstruct bookmarks.html back from a

db2bkmk.py [-s] [-p prune] [-o output_file] [-t dict.db [-r]]

db2bkmk.py reads bookmarks.db and creates two HTML files -

-s
   Suppress output of statistics at the end of the program. Default is
   to write how many records the program processed and how many URLs
   it created. Also suppress some messages during the run.

-p prune
   Prune the bookmarks tree when a folder with this name is encountered.

-o output_file
   Put output into a different file.

-t dict.db
   For most tasks, if someone needs to process bookmarks.db in a
   regular way (for example, replace all "gopher://gopher." with
   "http://www."), it is easy to write a special program that processes
   every DB record. But there are cases when someone needs to process
   bookmarks.db in a non-regular way: one URL must be changed
   in one way, another URL in a second way, etc. The -t option allows
   using an external dictionary for such translation. The dictionary itself
   is a FLAD database, where every record has two keys - URL1 and
   URL2. With the -t option in effect, db2bkmk generates a translated
   version of bookmarks.html, where every URL1 is replaced with the
   corresponding URL2 from the translation dictionary. (See koi2win.db
   for an example of a translation dictionary.)

-r
   Reverse the effect of the -t option - translate from URL2 to URL1.
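
The difference between a regular rewrite and a dictionary translation can be sketched as follows. This is not db2bkmk's code: the FLAD file is not parsed here, the dictionary is assumed to be already loaded as a list of (URL1, URL2) pairs, and all function names and sample URLs are hypothetical.

```python
def regular_rewrite(url, old_prefix="gopher://gopher.", new_prefix="http://www."):
    """Regular case: the same prefix rule applies to every URL."""
    if url.startswith(old_prefix):
        return new_prefix + url[len(old_prefix):]
    return url


def build_mapping(pairs, reverse=False):
    """Non-regular case: build a lookup table from (URL1, URL2) pairs.

    With reverse=True the table maps URL2 back to URL1 (the -r behaviour).
    """
    if reverse:
        return {url2: url1 for url1, url2 in pairs}
    return {url1: url2 for url1, url2 in pairs}


def translate(urls, mapping):
    """Replace every URL found in the mapping; leave the others alone."""
    return [mapping.get(url, url) for url in urls]


pairs = [
    ("http://old.example/koi/", "http://old.example/win/"),
    ("ftp://files.example/", "http://files.example/"),
]
urls = ["http://old.example/koi/", "http://other.example/"]

forward = translate(urls, build_mapping(pairs))
back = translate(forward, build_mapping(pairs, reverse=True))
print(forward)
print(back)
print(regular_rewrite("gopher://gopher.example.org/"))
```

Reversing only works cleanly when no two records share the same URL2; otherwise the later pair wins in the reversed table.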
------------------------------ check_urls -----------------------------

check_urls.py - Internet robot

check_urls.py runs a robot plugin against every URL. An additional Error
field appears in records that could not be checked for some reason;
the content of the Error field is the reason.

Inhibit progress bar. Default is to display progress bar if

Suppress output of statistics at the end of the program. Default is
to write how many records the program processed and how many URLs
it checked. Also suppress some messages during the run.

Check only those URLs that have an "error" mark in the DB.

Ugly mechanism to catch the welcome message from an FTP server (from urllib).
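
The idea of storing the failure reason in the record can be sketched like this. It is not the robot plugin itself: the record layout (a dict with "url" and "error" keys), the function names and the injectable opener are assumptions made for the example, and Python 3's urllib.request is used rather than whatever the real robot relies on.

```python
import io
import urllib.request


def check_record(record, opener=urllib.request.urlopen):
    """Check one record; on failure store the reason in record["error"].

    The record layout (dict with "url" and "error" keys) is an assumption.
    """
    record.pop("error", None)  # forget the result of any previous run
    try:
        opener(record["url"]).close()
    except (OSError, ValueError) as exc:
        # urllib.error.URLError and HTTPError are subclasses of OSError;
        # ValueError is raised for malformed URLs.
        record["error"] = str(exc)
    return record


def records_to_check(records, only_errors=False):
    """Select every record, or only those marked with an error."""
    return [r for r in records if not only_errors or "error" in r]


# Demonstration with fake openers, so no network access is needed.
def failing_opener(url):
    raise OSError("host not found")


def working_opener(url):
    return io.BytesIO(b"")


bad = check_record({"url": "http://no-such-host.example/"}, failing_opener)
good = check_record({"url": "http://www.example.org/"}, working_opener)
print(bad)
print(good)
print(records_to_check([bad, good], only_errors=True))
```

Passing the opener as a parameter keeps the checking logic testable without touching the network; by default it falls back to urllib.request.urlopen.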
------------------------------ convert_st -----------------------------

convert_st.py - convert between storages.

convert_st.py [-s] new_format

convert_st.py converts the database from one format to another.

-s
   Suppress output of statistics at the end of the program. Default is
   to write how many records the program processed and how many URLs
   it checked. Also suppress some messages during the run.
------------------------------ sort_db -----------------------------

sort_db.py - sort DB.

sort_db.py sorts the database according to one of the time
fields and dumps a sorted list of bookmarks.

Suppress output of statistics at the end of the program. Default is
to write how many records the program processed and how many URLs
it checked. Also suppress some messages during the run.

Sort by last_modified.
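
Sorting by a time field can be sketched as below, assuming each record is a dict and the time field holds a numeric timestamp. Only the field name last_modified comes from this document; the record layout and the function name are hypothetical.

```python
def sort_records(records, field="last_modified"):
    """Return records ordered by the given time field, oldest first.

    Records that lack the field (or have it empty) sort before all others.
    """
    return sorted(records, key=lambda record: record.get(field) or 0)


records = [
    {"url": "http://b.example/", "last_modified": 1000000000},
    {"url": "http://a.example/", "last_modified": 900000000},
    {"url": "http://c.example/"},  # never checked, no timestamp
]
for record in sort_records(records):
    print(record.get("last_modified"), record["url"])
```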
------------------------------ check_dups -----------------------------

check_dups.py - check duplicated URLs in the DB.

check_dups.py [-s] [-l logfile]

check_dups.py prints out a list of duplicated URLs (if any).

-s
   Suppress output of statistics at the end of the program. Default is
   to write how many records the program processed and how many URLs
   it checked. Also suppress some messages during the run.

-l logfile
   Save the list of dups in the logfile.
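
Finding duplicates comes down to counting how often each URL occurs. A minimal sketch, assuming the URLs have already been read from the DB into a list (the function names are hypothetical, and the log is any writable file object):

```python
import io
import sys
from collections import Counter


def find_duplicates(urls):
    """Return the URLs that occur more than once, in sorted order."""
    counts = Counter(urls)
    return sorted(url for url, n in counts.items() if n > 1)


def report_duplicates(urls, log=None):
    """Write one duplicated URL per line to log, or to stdout if no log
    is given, and return the list of duplicates."""
    out = log if log is not None else sys.stdout
    dups = find_duplicates(urls)
    for url in dups:
        out.write(url + "\n")
    return dups


urls = [
    "http://a.example/",
    "http://b.example/",
    "http://a.example/",
]
log = io.StringIO()  # stands in for the logfile
report_duplicates(urls, log)
print(log.getvalue(), end="")
```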
------------------------------ bkmk-add -----------------------------

bkmk-add - add a bookmark to the DB.

bkmk-add [-s] [-t title] url

bkmk-add adds a bookmark to the DB.

-s
   Suppress output of statistics at the end of the program. Default is
   to write how many records the program processed and how many URLs
   it checked. Also suppress some messages during the run.

-t title
   Force the title of the bookmark.